two more groups they extended the path down around the bottom 
end of the maze and up on the left side, leaving the food-box 
opposite the third alley. This study failed to reveal any tendency 
to orientation, but it did yield a very clear turning gradient which 
included the fifth point, the critical part of the gradient. The mean 
per cents of right-hand turns made by all four of their groups when 
combined were: 


#1 = 53.4; #2 = 70.9; #3 = 77.7; #4 = 86.1; #5 = 90.1. 

We accordingly conclude that Theorems 115 and 116 are both in 
agreement with empirical fact. 
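
The gradient just cited can be checked with a few lines of arithmetic. The following Python sketch is an editorial illustration and not part of the original study; the variable names are hypothetical, and the five values are the combined mean per cents quoted above.

```python
# Combined mean per cents of right-hand turns at points #1 to #5,
# copied from the figures quoted in the text.
right_turn_pct = [53.4, 70.9, 77.7, 86.1, 90.1]

# Successive differences between neighbouring points (rounded to one decimal).
steps = [round(b - a, 1) for a, b in zip(right_turn_pct, right_turn_pct[1:])]

# A turning gradient requires every step to be positive.
is_monotonic = all(step > 0 for step in steps)

print("steps between points:", steps)
print("monotonic increase:", is_monotonic)
```

Each successive point shows a higher per cent of right-hand turns than the one before it, which is the pattern a turning gradient requires.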

But the turning tendency itself will naturally be a positive func- 
tion of the goal gradient; i.e., it will be a function of the distance 
the reinforcement of the reaction in question is from the goal. 

This, coupled with the considerations leading to Theorem 115, 
leads to our next theorem: 


THEOREM 117. Other things constant, the antedating turning 
tendency will be weaker the farther away from the goal the point of 
reinforcement of the original turning movement in question is by direct 
measurement. 


We have no empirical evidence bearing on this corollary. 
From these considerations we arrive at our next theorem: 

THEOREM 118. Organisms which frequently enter a given blind 
alley will, on the average, enter less and less deeply as practice continues, 
later merely pausing or slowing down, and finally running the true 
path without interruption. 

Peterson gave special attention to the depth of entrance into 
blind alleys as learning progresses. In regard to this he stated (17, 
p. 32): 

The elimination of entrances to blind alleys does not come 
about mainly by a decrease in the number of entrances, but 
principally, especially in the case of the longer cul de sacs, by a 
gradual decrease in the degree, or the distance, of entrance. 
An illustration of this is found in two detailed records reported 
by Peterson as presumably typical of blind-alley entrance elimina- 
tion in a difficult maze having a total of 124 blind alleys. This 
Figure 70. Graphic representation of the progressive shortening of the distance 
entered into blind alleys as training continues. Plotted from computations based on 
Peterson’s published data from two typical rats on the same blind alley (17, p. 28). 

may be seen graphically in Figure 70, which shows that the number 
of complete entrances decreased as training continued, the number 
of partial entrances increased up to the third fourth of the errors 
and then decreased, the number of mere head and forefeet 
entrances increased to the fourth fourth of the errors made, and 



A BEHAVIOR SYSTEM 


after this errors of all sorts ceased. On the basis of the above 
results, counting a partial entrance as one-tenth of a complete 
entrance, the mean depths of entrance of all the errors were respec- 
tively, 92.2 per cent, 81.6 per cent, 74.8 per cent, and 59.3 per cent. 
These data, together with completely confirming results pub- 
lished by Reynolds (75), abundantly substantiate Theorem 118 
empirically. 

The Experimental Extinction of Blind-Alley Entrance 
Ordinarily the most direct path to a spatial goal is the one which 
will receive reinforcement. However, it may happen that only an 
indirect or long way to a goal will be reinforced. Mazes of this 
sort were used by Higginson (7), Valentine (26), and Gengerelli 
(4). A diagram of a modified form of Valentine’s maze, used by 
Reynolds, is reproduced as Figure 71. The solid-line pathway in 
this figure represents the most direct path to the goal, whereas the 
broken line represents the only path which was reinforced. The 
door at X was closed until the animal had passed at least its head 
and forepaws over the line at Y before proceeding back through 
E; then the door at X was opened, permitting access to G and the 
food. 

According to the principle of the spatial habit-family hierarchy, 
it is to be expected that after successfully completing the long path 
to the food at G in Figure 71 a very few times, the organism would 
make persistent attempts to go directly to G by the short path. 
On the basis of these considerations we arrive at our next theorem: 

THEOREM 119. When a naive organism reaches the goal in a maze 
traversing what would ordinarily be a blind alley, it will begin to show 
a marked tendency to short-circuit the long path even though the short 
path has never before been taken as 


Reynolds reported that in the situation represented by Figure 71 
the animals attempted to take the short-circuiting path by turning 
in toward the food, even though reinforcement came only after 
taking the long path. The animals also attempted very persistently, 
in traversing the long path, to turn around before reaching the 
line at Y, requiring a very large number of trials to eliminate these 
tendencies entirely (xvii). Reynolds reported similar results from a 
second investigation (20, p. 275), in which her eight animals took 
on the average 231 trials to reach her training criterion. This was 
a truly enormous number of trials for learning such a simple habit. 
Thus Theorem 119 appears to be empirically substantiated. 

Where experimental extinction is occurring by massed practice, 
which is followed by a period of no-practice, (20, pp. 279 ff.) the 



Figure 71. Diagram of the second Reynolds maze. The grid was removed from the 
maze except on the runs in which shocks were to be used for purposes of disinhibition. 
Reproduced from Reynolds (20, p. 274). 

no-practice will be associated with spontaneous recovery of the 
reaction tendency. 

Generalizing on these considerations we arrive at our next 
theorem: 

THEOREM 120. When naive organisms are trained by massed 
practice to traverse the “blind alley” of a maze in order to reach the 
goal, the tendency to attempt short-circuiting will undergo experimental 
extinction during the massed practice and a period of no-practice will 
produce spontaneous recovery. 

Miss Reynolds carried out her learning experiments by massed 
trials. In the one best illustrating the present theorem, twenty 
consecutive trials were given each day on the apparatus represented 
in Figure 71, the grid being removed. During the first ten trials 
the animals made a mean number of 25 attempts to go first to 
by turning in at X before going up to Y; during the second half 
of the trials that same day the mean number of such attempts was 
9; during the first ten trials on the following day, after some 23 
hours of no practice, the mean number was 18. Now, the fall of the 
second ten trials below the first ten on day 1 looks definitely like 
experimental extinction, and the rise of the curve at the first ten 
trials on the following day presents the picture of spontaneous 
recovery, because animals do not show anything like this amount of 
ordinary forgetting in 23 hours. This in turn indicates that the 
original loss was, in the main, not ordinary learning but genuine 
experimental extinction. We accordingly conclude that Theorem 
120 has an empirical substantiation. 
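
The arithmetic behind this argument may be laid out explicitly. The sketch below is illustrative only: the variable names and the two derived ratios are assumptions introduced here and are not measures reported by Reynolds. The three means are those quoted in the text.

```python
# Mean number of short-circuiting attempts, as quoted in the text.
mean_attempts_day1_first10 = 25   # first ten trials, day 1
mean_attempts_day1_second10 = 9   # second ten trials, day 1
mean_attempts_day2_first10 = 18   # first ten trials, day 2 (about 23 hours later)

# Fall within the massed practice of day 1 (read in the text as extinction).
loss = mean_attempts_day1_first10 - mean_attempts_day1_second10

# Rise after the period of no practice (read in the text as spontaneous recovery).
recovered = mean_attempts_day2_first10 - mean_attempts_day1_second10

# Hypothetical summary ratios, introduced only for this illustration.
extinction_fraction = loss / mean_attempts_day1_first10
recovery_fraction = recovered / loss

print(f"loss on day 1: {loss} attempts ({extinction_fraction:.0%} of the initial level)")
print(f"recovered by day 2: {recovered} attempts ({recovery_fraction:.2%} of the loss)")
```

On these figures the tendency fell by 16 attempts during the first day and 9 of those 16 had returned by the next day, so that a little over half of the loss was restored without any intervening practice.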

The introduction of an unusual or startling stimulus will cause 
the disinhibition (20, p. 278) of the internal inhibition which 
produces the experimental extinction just reported. Generalizing 
on this consideration, we arrive at our next theorem: 


THEOREM 121. When naive organisms are trained to traverse the 
“blind alley” of a maze in order to reach the goal, an unusual or 
startling stimulus introduced just at the entrance to a short-circuiting 
path will tend to produce a resumption of a previously extinguished 
tendency to short-circuit the “blind alley.” 


In three experiments based on the maze represented in 
Figure 71, Reynolds investigated the question of the disinhibition 
of extinction effects (79; 20). The disinhibiting agent in two of the 
It is well known that the effect of disinhibition is a relatively 
transitory phenomenon (75, p. 65). From this and the preceding 
considerations we arrive at our next theorem: 

THEOREM 122. When naive organisms are trained to traverse a 
“blind alley” of a maze in order to reach a goal, and disinhibition 
of an extinguished short-circuiting tendency is produced by an unusual 
stimulus, the tendency to omit the “blind alley” will be spontaneously 
lost soon after the disinhibiting stimulus ceases to operate. 

This question also was considered by Reynolds. In the experi- 
ment in which a curtain served as a disinhibiting agent, all of the 
seven animals used chose the long path without attempting to turn 
in at E (Figure 71), immediately after the disinhibiting process had 
occurred. Tendencies to this type of spontaneous recovery also 
appeared when an electric shock was used as the disinhibiting agent, 
though here the recovery was much longer delayed and was less 
complete. This apparently substantiates Theorem 122. 

Summary 

An analysis of multidirectional maze learning involves the opera- 
tion of several major principles: the goal gradient, the spatial habit- 
family hierarchy together with goal orientation, anticipatory turn- 
ing in the maze, and experimental extinction. 

The goal-gradient principle contributes to this learning by giving 
a special additional strength to the reaction potential of the shorter 
of every set of alternative paths. By the same action this principle 
mediates the elimination of long blind alleys more easily than short 
ones; the elimination of the last blind more easily than the first 
blind; and in general the backward order of the elimination of 
blind alleys. It also mediates the easier learning of short mazes 
compared with long ones; the rise of the speed-of-locomotion 
gradient in passage through a maze; and the increase in the rate 
of rise in the curve of probability of correct choices from the begin- 
ning to the goal end of the maze. In general, all of these deductions 
agree with observation except the last, and appropriate empirical 
test data of this have not been found. 

In multidirectional maze learning we also find a special case 
of the spatial habit-family hierarchy principle, one major sub- 
principle of which is the preference of the shortest available path 
by Henry Etta Reynolds. She found (1) that a tendency to enter 
a special type of blind alley pointing toward the goal decreased 
with massed practice; (2) that there was in part a spontaneous 
recovery of the tendency with 23 hours of no practice; (3) that 
when completed after long training the extinction underwent 
disinhibition as the result of a slight but novel stimulus; and (4) 
that this disinhibition disappeared on an immediately following 
trial. This combination of phenomena conforms exactly to the 
classical Pavlovian picture, thus making doubly convincing the 
interpretation that experimental extinction contributes to maze 
learning. 

Terminal Notes 

THE VALUE OF σsOR USED IN TABLE 33 

Since the value of σsOR is the unit by which reaction potential is 
measured, in such situations it should be 1.00. Actually the value 
chosen in the present exposition, e.g., on p. 277 and in Table 33, 
is .3012. The reason for taking this marked deviation from the 
theoretical value of σsOR is that a value of approximately this 
magnitude had to be used in order to secure something like usual 
blind-alley elimination scores while using the present indications 
of maximum sER, i.e., M and the factor of reduction (F = Ko). 
The cause of this necessity probably is, as pointed out earlier in this 
chapter, that the goal gradient is only one of several factors opera- 
tive in the process of blind-alley elimination. An additional factor 
of major importance not taken into consideration in the computa- 
tions in question is believed to be that of experimental extinction. 
they begin as simple conditioned reflexes and later extend to be- 
havior chains of various lengths with definite goals (rG’s) and goal 
stimuli (sG’s) (2). 

The Problem of Locomotor “Insight” Posed 

At this point there arises a critical question in behavior theory 
which has been debated for thirty years or so. Is there a single and 
distinct behavioral element variously called insight or intelligence, 
which aids in the orderly assembly of chain segments beyond the 
limitations of chance suggested above? This is to ask whether there 
is a peculiar mechanism called insight, or intelligence, which has 
the power of joining, i.e., spontaneously organizing, two behavior- 
chain segments previously learned on separate occasions so that 
together they will solve a problem faced by the organism at a later 
time (P, p. 46). For example, let us suppose that in a maze such as 
that represented in Figure 72 an organism has learned on one 
occasion the path J to L, with a large food reward; on a separate 
occasion, the path H to J, with a small food reward; and on a third 
occasion, the path H to N with a similarly small food reward. 
Following this preliminary training, the hungry animal is placed 
at H. Will the animal go (1) to N and get a small amount of food? 
Or will it go (2) to J, get a small amount of food, and thence to 
L where it will find considerably more food? In terms of behavior 
chain segments, will the fact that the animal possesses the heavily 
rewarded behavior segment JL add weight to the choice of the 
path HJ versus the path HN beyond the normal or chance per 
cent of choices? In the experimental investigation of such a problem 
it would, of course, be necessary to determine with care the per 
cent of the particular subject’s choices of paths HJ and HN before 
beginning the training on JL. 

It is perfectly obvious that normally intelligent humans would 
choose path HJL rather than path HN. How far down in the animal 
scale this capacity extends remains to be determined experi- 
mentally. We are at present far from knowing enough about 
individual and species differences (XVII) to speak with any 
confidence on this matter from the theoretical point of view. How- 
ever, the organism’s performance of the sequence HJL, particularly 
at points H and J, may vary greatly; it may range from a smooth 
(rapid) unified act to a very slow and halting series of acts, depend- 



10. The Problem-Solving Assembly 
of Behavior Segments 


In our progressive analysis of adaptive behavior we shall now 
consider the concrete problem-solving behavior of non-speaking 
mammalian organisms. We have seen how chance variation (sOR) 
in combination with reinforcement and experimental extinction 
gives rise to trial-and-error behavior, and how trial and error in 
turn gives rise to behavior chains (pp. 156 ff.). With the exception 
of the conditioned reflex in a pure form, all types of behavior 
which mediate learning constitute problems for the organism, in 
one way or another. The problem consists in securing food, or a 
or in avoiding nocuous stimuli, and so on. 

Moreover, these behaviors normally display a kind of direction, 
albino rat, for example) first thoroughly forms the locomotor 
habit JKL with a very large food reward at L; then on a different 
occasion, say 24 hours later, it forms the locomotor habit H1IJ 
with a comparatively small food reward; and finally, on a third 
occasion, it forms the locomotor habit H2MN for a similar small 
food reward. After these three habits are formed the animal is 
placed at H and observation is made as to whether it goes to H2MN 
or to H1IJKL. Insight, of course, would lead to the choice of 
H1IJKL and the much larger reward (K') at L, as compared with 
H2MN. 

Secondly, it must be noted that in spite of the fact that the 
locomotor habits H1IJ and JKL were formed independently, they 
have box J in common. But this common box J makes possible 
the functional or dynamic junction of the two habit segments by 
means of the two fractional antedating goal reactions (2). The 
fractional goal response rG″ (in Figure 73) first moves from L 
back toward J and then is evoked by J itself. Then when habit 
H1IJ is formed this rG″, now attached to J, becomes a part of J 
and is brought forward to path H1. Thus a functional connection 
is established between the two related habit segments, and becomes 
the basis of their subsequent unity. 

Some of the theoretical details of this process are given by the 
three S → R diagrams of Figure 73. Diagram I shows the antedat- 
ing goal reaction at L, (rG″), the two primes of the subscript indicating 
the very large reward. Diagram II shows the same tendency for the 
antedating goal reaction to come forward in series H1IJ. But since 
both rG′ (indicating the small reward) and rG″ are already at J 
by the antedating tendency of series JKL, there are here two 
antedating reactions — one leading to J and one ultimately leading 
to L. Finally, Diagram III shows the stimulus-response sequence 
set up on the less adaptive locomotor series H2MN. 

And now we come to the test for the presence of insight. The 
animal, 24 hours hungry, is placed at H and allowed to choose. 
Which bonds lead toward MN, i.e., to H2 and a small amount of 
food, and which to IJKL, i.e., to H1 and several times as much 
food? Consider the learned reaction tendencies leading to RH1 and 
to RH2 in theoretical Diagrams II and III respectively. An inspec- 
tion will show that in II five bonds lead to RH1 and in III only four 
lead to RH2. All the bonds present in III are presumably the same 
ing upon the capacity of the organism in question to join inde- 
pendently acquired behavior segments into novel wholes. This 
implies that the well unified type of response combination will be 
comparatively strong; i.e., that the chance of the HJL choice will 
be around 100 per cent, and that of the HN choice will be near zero 
per cent. Such choices would be easy enough to distinguish either 
statistically or by inspection. But in the case of a feeble but genuine 
theoretical expectation that RH1 > RH2. Such behavior is regarded 
as a rather simple form of concrete insight. 

Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 123. If two separately formed spatial locomotor habit 
segments possessed by an organism would yield, when operating con- 
secutively, a major goal, there would be a distinct tendency for them 
to so operate on the first occasion that offers. 

A second point at which the insightful mechanism of the antedat- 
ing goal reaction stimulus (sG″) operates (Figure 72) is at box J. 
From H to I and from I to J there are connecting reinforced traces, 
as there are from J to K and from K to L. But in the case considered 
above there can be no directly reinforced perseverative trace con- 
nection between I and K through J. The presence of such con- 
tiguously reinforced stimulus traces in a control group of organisms 
trained with reinforcement to go directly from H to L would imply 
that the locomotor process through box J will be somewhat slower 
in a group being tested for insight. 

Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 124. On the first execution of an insightful behavior 
sequence involving two newly assembled behavior segments, there will 
be a longer latency at the junction point than in a control situation in 
which the locomotor sequence has previously been complete and reinforced. 

However, once the animal has made a few rewarded runs from 
H1 to L, the rewarded stimulus-trace connections will be added to 
the insightful connection, making the response latency at point J 
approach that of a control experiment in which the subjects have 
received rewarded runs continuously from H1 to L from the 
beginning of training (5, p. 232). 

Generalizing from this and the preceding considerations, we 
arrive at our next theorem: 

THEOREM 125. In the course of a few normally rewarded response 
evocations of the insightfully joined habit, the reaction latency at the 
point of junction will be progressively shortened. 
as the comparable bonds in II. This would imply that the advantage 
of RH1 is due to the additional presence of sG″. There is of course 
the presumptive presence (not shown in the diagram) of the stimuli 
arising from the distinctive floors of the boxes, especially L, and 
tend to reinforce that activity. But the animal’s spontaneous ex- 
plorational passage without food reward from H2 to N will not 
produce this secondary reinforcement. It therefore follows that 
under these circumstances if later the hungry animal is placed at 
H and allowed a free choice, it will weakly tend to choose H1 
rather than H2. This would be a case of feeble insightful behavior. 

Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 128. If the final posterior habit segment of a potentially 
insightful behavior combination has alone been well rewarded and then 
the animal is permitted to run freely in the remaining two regions, a 
suitable test given a short time later will reveal a weak tendency to in- 
sightful behavior. 

Consider an additional change in the two supposed forms of 
behavior discussed above. Let it be supposed that after the two 
types of preliminary training, but just before the test for insight, 
the animal is placed at L and allowed to eat a little of the food 
there. This is known as prefeeding. The perseverative after-effects 
of this food consumption duplicates to a considerable degree the 
original feeding at L, making it much stronger than the anticipatory 
normally transferred to H by the stimulus trace generalization. 
From this it follows that this stronger rG″ (despite some loss from 
stimulus-intensity generalization) should yield a stronger difference 
in favor of H1J over H2N. 

Generalizing on these considerations, we arrive at our next 
theorem: 

THEOREM 129. In the case of a situation favoring spatial locomotor 
insight, a small prefeeding at the goal will increase the probability of 
an animal’s insightful behavior. 

The preceding considerations regarding insight have been based 
on two habit segments. We must now take up the question of 
whether three previously independent habit segments will spon- 
taneously integrate themselves in such a way as to evoke concrete 
problem-solving behavior in an organism such as an albino rat. 

Following the derivation of Theorem 123, the antedating mech- 
anism there elaborated presumably could be extended to three 
habit segments as follows. The terminal segment including rG″ 
At this point we may make an observation which has an im- 
portant bearing on the relationship between insightful behavior 
and trial-and-error learning. In case the insight mechanism is not 
very strong, the sOR principle may occasionally override it, though 
not usually. This would yield a minority of the trials for insight 
occurring without reinforcement, though the majority of the 
responses would be reinforced. But this choice element is the sub- 
stance of trial-and-error learning. These trial-and-error responses 
will naturally supply the reinforced stimulus traces originally 
lacking in strength in the insight mechanism. Thus, while the two 
processes may be contrasted and must be distinguished, they 
evidently supplement each other as indicated in the above theorem. 

This consideration leads us to our next theorem: 


THEOREM 126. Reinforced trial-and-error learning, following 
locomotor insightful behavior, tends to make good the natural weakness 
at the junction of the two segments. 

Let us suppose a slightly different experimental arrangement, 
one in which an animal, after many distributed rewarded runs 
from J to L is permitted to live freely in the joined sections of the 
““dating reaction R„.., the 
its exnln™^ e° J than at other places, once 

Sl Cd t in""" experimental 

This leads to our next theorem: 

insiphtfti^^^ posterior habit segment only of a potential 
from the antedating goal reaction p secondary reinforcement 
preceding analysis, it would seem that molar insight is mainly 
dependent upon the organism’s capacity to transfer antedating goal 
reactions from one temporal situation to another. Then of course 
there is the problem of the assembly of the habit segments once 
they have been strung together on the thread of rG″. A little such 
speculation shows quite clearly that the present type of analysis 
demands a far more minute quantitative molar knowledge of 
behavior than we have at present. 

Turning now to the question of the empirical validity of Theo- 
rems 123 to 130, we must say that despite the fact that two experi- 
mental studies have been published on the subject, no evidence 
bearing exactly on any of our theorems has been found. In a 
study (5) which very roughly approached the conditions of Theo- 
rem 123 but which employed too few subjects for conclusive results, 
Maier claimed to have attained positive evidence of concrete 
spatial locomotor insight in albino rats; the statistical reliability 
must have been utterly unsatisfactory. 

One of three experiments (experiment #2) by Wolfe and Spragg 
(70), who used many more animals and report a closer approxi- 
mation to the conditions of our Theorem 128, yields some evidence 
rather favoring concrete spatial locomotor insight in albino rats. 
However, the statistical reliability of these results also was not 
satisfactory. 

No experiments bearing even remotely on Theorems 124, 125, 
126, 127, 129, and 130 have been found. For this reason these latter 
theorems have the status of genuine theoretical predictions. 

Spontaneous Tool-Use Acquisition 

The major portion of what is now to be presented concerns insight- 
ful behavior in relation to the acquisition of the use of instruments 
or tools. This subject is inserted here because many have assumed 
that tool-using behavior involves insight (7; 3; 8). It accordingly 
becomes necessary to give an elementary account of one form of 
simple tool-use acquisition, though the case we shall consider here 
involves a problem of prime importance in its own right. 
The use of tools is so natural and universal with humans that we are likely 
to pass this problem over without a thought, or to consider it too 
unimportant to merit serious consideration. A greater mistake 
could scarcely be made. 
would be formed as before. The middle segment would overlap 
the anterior end box of the terminal segment. This would transfer 
rG″ to the anterior end of the middle segment. The posterior end 
of the initial segment would overlap the anterior box of the middle 
segment. This in turn would transfer the rG″ to the anterior end 
of the anterior segment, thus completing the very tenuous but 
presumably genuine insightful integration of three habit segments 
in a concrete problem situation. 

Generalizing on the preceding considerations, we arrive at our 
next theorem: 


THEOREM 130. Three or more independent habit segments may be 
assembled in orderly sequence for problem solution, but the mechanism 
will be progressively more tenuous than will that of the assembly of two 
habit segments for the same function. 

At this point the reader must recall the principle of the spatial 
habit-family hierarchy (Chapter 8). This states in effect that if a 
goal is located at any point in free space, all conceivable paths from 
an organism’s position at the time to the goal point tend to become 
ih.'™", reaction potentials, their strength being the reverse of 
hier. 'hings equal. So far as the habit-family 

with H 72 starting 

to HjMN, and both would be weaker than a 
Hill and H ^ ^ follows from this that both 

sidLd ah^ T tendencies to action when con- 

following geneml°prindpkr“‘‘"® considerations, we arrive at the 

ccccmpmid fynLlT nZoTZlh 
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mainly influenfeUn CTM^'g^hraiff “ 

possessing a high level of insLht and “ 
insight. It must be confessed that ^ u P°f a low level of 
molar mechanisms underlying i„d°“d concerning the 

begun to devclon ^ ^ tlividual differences has hardly 

fnmre, a few stL^^® T"''* " development in the 
significant that the drives in play are very mild and that the goals 
are equally mild. Both processes appear to be largely concerned with 
manipulation. Now it happens that with this casual manipulation, 
the stick was simultaneously associated with the instrumental use of 
the hand which held the stick in reaching toward objects. In this 
connection Birch remarks (7, pp. 374, 375): 

However, in the course of this reaching, the animals, after 
having established manual contact with the object, sometimes 
poked at and touched it with the stick. . . . The object was 
not reached for with the stick but with the hand with the stick 
appearing to play no functional role in the reaching-contacting 
pattern. . . . [At a later] observation period of the first 
day the animals were seen to reach out and touch distant 
objects, usually another of the chimpanzees, with the stick. 
By the end of the first day during which the sticks were avail- 
able to them, every one of the subjects had on several occasions 
used the stick as a functional extension of the arm. That is, 
they had all been observed to reach out with the stick and touch 
some animal or object distant from themselves. . . . During 
the second [day of stick play] the animals were observed to be 
using the stick more frequently as an arm extension, and several 
times fights were started when one chimpanzee poked another 
sharply with a stick. 

Thus we appear here to have an account of the shift from manipulative 
play with the stick to instrumental use of the stick, a change of the greatest 
theoretical importance. At the end of this period presumably a 
considerable variety of goals have become fairly well attached to 
stick behavior. Let us consider how this could occur. 

It must be recalled in this connection that the basic mechanism 
underlying response generalization (4, pp. 316, 319) is found in 
sOR deviations which are relatively small individually, and that 
these deviations which chance to move in a fortunate direction, 
i.e., in a direction favorable to the manual goal dominant at the 
moment, will be reinforced. In short, this response generalization 
will consist of a kind of trial-and-error learning. The reinforcement 
will serve as a new basis for further oscillatory deviations (xm, A 
and B) which also will be reinforced; and so on. The 
observations show that once the instrumental use of the sticks had 
begun, the extension of this use, while definitely gradual, was rapid. 




The organism whose behavior we shall examine is the chim- 
panzee, and the instrumental use to be acquired by this superior 
animal is the power of intelligently reaching beyond the arm’s 
length. Consider, then, a young chimpanzee which in spontaneously 
attaining certain goals is obliged to perform various manipulations 
of its environment, such as grasping a banana in its hand. But sup- 
pose that the banana lies a little out of arm’s reach, through the 
bars of the chimpanzee’s cage. If the hoe-like stick shown in 
Figure 74 were present, it would be used by a human in such a 
situation as a tool to drag the banana near enough to be reached 
by the hand. But apparently the chimpanzee will not do this until 
it has first learned the instrumental use of the stick. The point is 
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and arm habits will lead the animal to substitute the stick as an exten- 
sion of the arm for the manipulation of distant objects. 

Theory of the Insightful Assembly of Tool-Using Behavior Segments 

Before giving the chimpanzees the sticks to play with, both Jackson, 
with one animal (5), and Birch, with five animals (1), made a 
thorough test to see whether they would use a stick-like tool in an 
“insightful” way to drag in food which was out of reach. The tool 
in both cases was a stick with a piece across its end making a kind of 
hoe, as represented in Figure 74. These tests definitely failed to 
yield insightful behavior. The several hours of free play with simple 
straight sticks then followed as just described. After the acquisition 
of the basic instrument-using skill or habit segment, the test for 
insightful behavior was repeated as given in the first place. All the 
animals grasped the stick-like tool either at once or almost at once 
and pulled in the food, picked it up in the hand, and ate it. In 
other words, after the establishment of the habit segment of length- 
ening the reach of the arm and hand by means of a straight stick, 
the junction of the two habit segments took place and the out-of- 
reach food was promptly secured and eaten. How shall we explain 
the occurrence of this supposedly insightful behavior? 

We have already considered the relatively simple case of loco- 
motor insight as based on the conditions of Figure 72. Let us now 
try the same analogy on this case of supposed tool-using insight. 
Control path H2MN corresponds to the futile stretching of the 
hand and arm toward the out-of-reach food. As a matter of fact, in 
reporting the unsuccessful trials for insightful behavior preceding 
the learning of the basic instrumental use of the stick, Birch states 
that each of the animals futilely reached out toward the food many 
times (1, p. 373); this shows that reaching corresponded rather 
well in fact to path H2MN as a neutral control habit segment. 
Similarly, path JKL quite evidently corresponds closely to picking 
up the food, carrying it to the mouth, and eating. To complete the 
analogy in detail we must say that path H1IJ corresponds to grasp- 
ing the tool, to extending it a little beyond the food, and to dragging 
the food nearer by means of the tool. But this analogy is not so 
close. One point in the theoretical analogy which is here lacking 
from our original derivation is that in the space insight problem 
the end of segment H1IJ terminated with food, whereas the stick-play be- 
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Two factors in this stick play should be carefully noticed here. 
The first is that when casual manipulative behavior receives even 
feeble reinforcement, the reinforced acts become goal reactions or 
subgoal reactions with the accompanying tendency to the fractional 
antedating of the original occurrence, the resulting guiding stimuli, 
and so on. 

The second factor to be noticed in connection with this stick 
play is that the movements, especially as seen, and the associated 
cutaneous stimuli are rather similar to those involved in touching 
objects with the fingers. The fingers are seen as an extension of the 
hand and arm, just as is the stick when it is held in the hand; when 
the hand is on its way to touch an object the visual distance separat- 
ing the two decreases continuously, just as it does when by chance 
the stick held in the hand moves toward a touchable object; more- 
over, the resulting change in manual pressure coming indirectly 
through the stick resembles somewhat the pressure stimuli coming 
from a touch with the fingers. Also intertwined with the touch 
stimuli will be the touch anticipatory goal reactions mentioned as 
our first factor, coupled with their similarity to the strong manual 
subgoal reactions, especially when the latter are first frustrated by 
the fact of the touchable objects being a little out of reach. 

It is believed that the similarity of these stick-produced stimuli 
to manual touch-produced stimuli will tend, through stimulus 
generalization, to evoke under certain favorable manipulative 
circumstances the arm movements normally associated with hand 
movements leading to touching. The occasional success of such 

But after the grasping of the handle of the hoe there remains 
the task of using the hoe to drag in the food. This is probably the 
most critical part of the act of insight. The stick cannot grasp the 
food as a really elongated hand might do. In this connection it 
appears that the first act in a novel situation is to touch the goal 
object. The animal has learned to do this from its stick play. 

This leads us to part E of Theorem 132: 

THEOREM 132 E. Once the tool is grasped, it is first used to touch 
the goal object. 

The position of the hoe in Figure 74 shows that in 
first touching the food with the hoe the organism moved it slightly 
nearer. Reports of chimpanzee behavior show that they are 
quick to be reinforced by a favorable direction of slight movement 
of the goal object. This reinforced reaction will be generalized to 
havior had never been associated with food. How shall this discrepancy be 
met? The recast deduction involves several steps as follows: 

It will be recalled by consulting sequence I of Figure 73 that 
even though the subgoal reaction (RG') precedes the goal reaction 
(RG) in a behavior sequence, the fractional antedating goal reaction 
(rG) as a stimulus (sG) actually precedes the subgoal reaction. It 
accordingly follows that the antedating goal stimulus (sG) is rein- 
forced to subgoal response (RG'). From this it follows that when an 
anticipatory goal reaction is initiated by the sight of food, even 
though at a little distance beyond the cage bars, the various subgoal 
responses of the original series tend to be evoked more or less in 
their usual sequence. 

Generalizing on these considerations we arrive at part A of our 
next theorem: 

THEOREM 132. In the acquisition of simple tool-using behavior by 
chimpanzees: 

A. The fractional anticipatory goal stimuli (sG) are reinforced to all 
the subgoal responses (RG') of the series by the reward received at the 
end of the behavior series. 

It happens that in the tool-using situation here under considera- 
tion the animals had reached through the bars for the food without 
success (RG') many times. This would naturally set up appreciable 
amounts of inhibition which would tend to become conditioned to 
the accompanying stimuli. On the other hand, in the (subgoal) 
play activity with the sticks which involved stretching to distances 
beyond reach of the hand, touching, poking, and so on, RG'' had 
been uniformly successful. Here, therefore, we have a situation 
resembling that of trial and error in which a given set of stimuli 
is connected in such a way as to evoke two incompatible responses, 
one of which (reaching with the hand) has been weakened, but not 
abolished, by experimental extinction, and the other of which 
(reaching with the stick) even though originally considerably 
weaker, is now dominant. 

Generalizing from these considerations we arrive at part B of 
Theorem 132: 

THEOREM 132 B. The antedating goal stimulus (sG) released by 
the sight of food at a distance tends to evoke two subgoal responses, the 
false one of which (RG') has been weakened by experimental extinction 
to the local stimuli (increased size of retinal images, greater binocular 
convergence tension, and so on) below the other, the latter being the 
true, tool-using one (RG''). 

But since, by Theorem 132 B, sE(RG'') > sE(RG'), it follows that RG'' 
will be evoked rather than RG'. 

Generalizing on this we arrive at part C of Theorem 132: 

THEOREM 132 C. The successful subgoal RG'' ultimately will be 
evoked rather than the unsuccessful subgoal RG'. 

But it must be remembered that the critical stimuli involved in 
reaching with the tool are those coming from the stick-like handle 
(Figure 74). Now the stick-hoe as a whole is a little different from 
the play-sticks, but its handle is similar to them, especially at the 
near end. By the principle of qualitative stimulus generalization 
(XA) this stimulus similarity will be sufficient to evoke a grasping 
of the handle of the stick-hoe. 

Generalizing on these considerations we arrive at part D of 
Theorem 132: 

THEOREM 132 D. The similarity of the new tool is close enough 
to evoke the first part of the dominant subgoal response (RG''), that of 
grasping. 


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opposition to the doctrine be sharply distinguished one from the 
other. This can be done by dealing with the two distinct uses to 
which the term insight has been put, the one categorical, the other 
explanatory. Hitherto the major achievements associated with 
the term insight, have been categorical. Maier advanced the 
problem as a whole by a useful bit of analysis. He described the 
phenomenon of insight as “the ability to bring together spontane- 
ously two elements of past experience without having them previ- 
ously associated by contiguity” (5, p. 46), the elements in question 
being habit segments. This advances the categorizing factor, or 
identification, toward the explanatory. This chapter has been con- 
cerned primarily with an attempt to understand how the phenome- 
non of insight comes about; i.e., it has been concerned with the 
explanatory aspect of the problem. 

Briefly stated, the novel behavior mechanism which is mainly 
instrumental in the unique behavior displayed in what we have 
called insight, is the antedating goal reaction which is character- 
istic of behavior segments. It is this identity of fractional antedating 
elements which bridges the gap left by the lack of associative 
contiguity mentioned by Maier. Thus we find ourselves reverting 
in a sense to association by similarity, proposed by William James 
(^0 in his attempt to explain rationality some forty years ago. 

In the case of tool-using insight the present analysis finds the 
central factor to be the antedating goal reaction together with the 
subgoal reaction, combined in a complex way with arm- and hand- 
reaching extension. This analysis greatly needs to be followed by a 
series of carefully controlled experiments on stick play which will 
reveal in some detail the manner in which the mechanism of 
response generalization operates when the animals shift from one 
goal-response with the stick to another. It is strongly suspected 
that the mechanism which we have described as insight in the use 
of the stick-hoe also operates when the simple stick is used in play. 


REFERENCES 

1. Birch, H. G. The relation of previous experience to insightful 
problem-solving. J. Comp. Psychol., 1945, 38, 367- 

2. Hull, C. L. Goal attraction and directing ideas conceived as 
habit phenomena. Psychol. Rev., 1931, 38, 487- 




other appropriate muscular activities by the principle of response 
generalization (xiii A and B), which will promptly bring the food 
within reach. 

From the preceding considerations we arrive at part F of Theo- 
rem 132: 

THEOREM 132 F. Secondary reinforcement and response generaliza- 
tion of the acts which gave an approaching movement to the goal 
object will rapidly lead to the dragging forward of the object and the 
reinforcement of the act as a whole. 

Theorem 132 F appears to complete the deductions of the insight- 
ful use of a simple tool. This is to say that after the stick play the 
chimpanzees, following a slight delay, would take the equivalent 
of the joint behavior segments H1IJKL rather than the unrewarding 
segment H2MN (Figure 72). No strictly unique principle has been 
used in the theoretical derivation of this process. It is true that a 
small element of trial-and-error learning was assumed, but many 
other factors were also assumed in the deduction. Moreover, the 
stick play resulted in much learning which had an element of trial- 
and-error throughout. Indeed it seems likely that a meticulous 
series of carefully controlled experiments on this problem will re- 
veal substantially the same elements of insight in the progressive 
stick play as we have described in our deduction of the several 
parts of Theorem 132. 

In addition to using sticks as tools, there is much evidence that 
anthropoids throw objects as missiles; this again is a kind of exten- 
sion of the arm and hand, but here contact is withdrawn. [...] 
experiments have been performed with [...] learning. On the basis 
of the discussion concerned [...] it is to be expected that a knowl- 
11. Value, Valuation, and Behavior Theory* 


Can Value and Valuation Be Treated Objectively? 

As our final view of individual behavior in its quasi-social aspects, 
we shall consider some of the phenomena and problems associated 
with the theory of value and valuation. Actually we have been 
dealing with the behavioral substance of value theory throughout 
all the preceding chapters. We must now treat it specifically. 

The relationship of valuation to behavior theory can be clarified 
by a concrete example. Consider an ordinary apple. Such an 
object may be approached from many different scientific angles. 
Physics may treat of the light reflected from its surface or consider 
its weight and density; chemistry may discuss the constitution of 
its juice; botany may present its relationship to other plant species; 
plant physiology may report its processes of growth and reproduc- 
tion. This list could be extended almost indefinitely. But in addition 
to these types of approach there is another of a somewhat different 
nature; this lies in the fact that the apple has a market price, i.e., 
it has value. 

Value theory has a long history, much of it complicated by 
subjectivism. In illustration of this, let us examine briefly the kind 
of theoretical tangle which the injection of metaphysical presuppo- 
sitions into the subject will produce. This may be clearly seen in the 
following extract from Robbins (11, pp. 87-90), who believes that 
value and valuation are quite beyond the powers of an objective 
scientific methodology such as that already put forward. 

* This chapter is based to some extent on an article by the author (5), several para- 
graphs of which have been transcribed with little or no change. 


will strive to secure and eat it. Conversely, having smelled and 
nibbled at a novel kind of nourishing food, these organisms will 
come to value it, as is shown by their striving for it when hungry. 
In short, value represents the potentiality of action. But action 
potentiality in this system is represented by sER. And the presence of 
sER serves to introduce the whole series of factors upon which it 
depends, as demonstrated by equation 8, 

sER = D × V1 × K × sHR, 

together with their determining circumstances. We have selected 
from the many possible forms of value the above example related 
to food needs because of its general familiarity to the reader, its 
simplicity, and its comparative lack of political and metaphysical 
bias. 
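
The multiplicative relation of equation 8 can be illustrated numerically. What follows is only a minimal sketch and not part of the original text: the function name `reaction_potential` and every numerical value are hypothetical, chosen merely to show that the striving potential vanishes whenever any one factor is zero.

```python
def reaction_potential(drive, stimulus_intensity, incentive, habit_strength):
    """Equation 8 as a plain product: sER = D x V1 x K x sHR.

    All arguments are illustrative magnitudes on arbitrary scales
    (assumption: the text gives no units or numerical values here).
    """
    return drive * stimulus_intensity * incentive * habit_strength


# Hypothetical values for a hungry organism facing a familiar food.
hungry = reaction_potential(drive=2.0, stimulus_intensity=1.0,
                            incentive=0.5, habit_strength=0.8)

# Same organism and food, but with no history of reinforcement (sHR = 0).
unfamiliar = reaction_potential(drive=2.0, stimulus_intensity=1.0,
                                incentive=0.5, habit_strength=0.0)

print(hungry)      # 0.8
print(unfamiliar)  # 0.0
```

On this reading a substance is not striven for, however great the need, until reinforcement has built up some habit strength.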


The Paradox of the Locus of Value 

The quantitative systematization of the theory of value and 
valuative behavior enables us to resolve certain paradoxes which 
have commonly been associated with “theories” of value. A dis- 
cussion of some of these, besides being of interest in its own right, 
may have the further merit of introducing the reader to the natural- 
science approach to this important set of phenomena. 

One of the standard problems of this type concerns the essential 
locus of economic value: whether it lies in the valued object or in 
the valuing organism. In a certain sense the question is a false one 
in that it implies that the locus must be exclusively in one or the 
other. It is a little like asking whether the momentum of a falling 
object is due primarily to its mass or to the time it has been falling; 
the fact is that a knowledge of both is indispensable for the determi- 
nation. The habit strength (sHR) resides in the state of the nervous 
system of the organism. This in turn results from a certain historical 
relationship between the organism and the object, situation, or 
state of affairs which has value (K') such that the former has 
learned through reinforcement to strive for the latter. In 
a special sense value, as distinguished from valuation, [may be said] 
to lie in that characteristic (K') of the substance or commodity 
which makes it a reinforcing agent to that organism, but it is 
equally true that the reinforcement process depends upon 
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especially his total lack of comprehension of the role of symbolic 
constructs in natural-science theory: 

Scientific method, it is urged, demands that we should leave 
out of account anything which is incapable of direct observa- 
tion . . . Valuation is a subjective process. We cannot observe 
valuation. It is therefore out of place in a scientific explanation. 
Our theoretical constructions must assume observable data. 
. . . [This] is an attitude which is very frequent among those 
economists who have come under the influence of Behaviourist 
psychology or who are terrified of attack from exponents of 
this queer cult. 


. . . The argument that we should do nothing that is not 
done in the physical sciences is very seductive. But ... it is 
very questionable whether this can be done in terms which 
involve no psychical elements. . . . The idea of an end, which 
is fundamental to our conception of the economic, is not pos- 
sible to define in terms of external behaviour only. If we are to 
explain the relationships which arise from the existence of a 
scarcity of means in relation to a multiplicity of ends, surely 
at least one-half of the equation, as it were, must be psychical 
in character. . . . But . . . the procedure of the social sci- 
ences which deal with conduct, which is in some sense pur- 
posive, can never be completely assimilated to the procedure 
of the physical sciences. It is really not possible to understand 
the concepts of choice, of the relationship of means and ends, 
[...] of our science, in terms of observation of 
[...] of purposive conduct . . . involv[es] [...] 
explanation which are psychic[al], [not] physical [...] 

[...] chapters, the reader should have little difficulty [...] 
the present approach with the subject of value [...] 
that valuation is at bottom [...] aspect of behavior and in so far is 
capable of the [...] treatment as we have used in the [...] 
chapters. This means that many of the basic aspects [...] 
object are shared by the lower animals. 
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illustrates at a coarse level the behavioral fact so prominent in the 
economic theory of price determination, namely, that individual 
men value certain goods more than other goods, and that these 
valuations in fact display a hierarchy from high to low valuation. 
This commonplace observation that 

sER1 > sER2 

arises because 

sHR1 > sHR2, 

or 

D1 > D2, 

or 

K1 > K2. 

This has been believed in certain quarters to give rise to problems, 
such as those of choice, which are quite insoluble by the method- 
ology available to natural science. 

The supposed problem of how and why the striving potentials 
for, or evaluations of, different objects, substances, or states of 
affairs displayed by a given organism vary from time to time is 
resolved quite simply by the present approach. This has been 
incidentally elaborated above in Chapter 2. If the two stimulating 
situations are presented simultaneously or in close succession in 
such a way that the acts of striving for one preclude the simultane- 
ous performance of the acts involved in striving for the other, there 
arises a competition within the body of the organism and that 
reaction potential which is momentarily greater mediates the 
corresponding reaction (xiv). The laws governing the resolution 
of the competition of two reaction potentials are different from 
those involved in the dominance of the heavier of two weights on 
the pans of a balance, but the outcome is closely analogous; in 
the case of behavior competition the balance is the organism itself. 
In the one case the process is no less naturalistic than in the other, 
and no greater metaphysical mystery surrounds it. It is indisput- 
able, of course, that the theoretical determination of the outcome 
of a choice situation is more complex than is the question of the 
dominance of the heavier of two weights on a balance, but the 
labor involved in the theoretical determination is not the matter at 
issue here. 
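
The balance analogy can be put in a few lines of code. This is a hedged sketch only: the function `dominant_reaction`, the reaction labels, and the numerical potentials are all hypothetical and serve merely to show the rule that the momentarily greater reaction potential mediates the reaction.

```python
def dominant_reaction(momentary_potentials):
    """Return the label of the reaction with the greatest momentary potential.

    `momentary_potentials` maps a reaction label to its reaction potential
    at the instant of stimulation (arbitrary units, illustrative values).
    On an exact tie, max() returns the first label encountered, which is an
    artifact of this sketch and not a claim about behavior.
    """
    return max(momentary_potentials, key=momentary_potentials.get)


# Two incompatible strivings presented together (hypothetical values).
momentary = {"strive_for_apple": 1.4, "strive_for_grape": 0.9}
print(dominant_reaction(momentary))  # strive_for_apple
```

The organism here plays the part of the balance: nothing beyond the comparison of two magnitudes is needed to determine the outcome.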
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characteristics of the organism; hay will reduce the food need of an 
ox but not of a man. A substance will not be valued (striven for) 
by an organism until the process of reinforcement has occurred, 
i.e., until sER exists. 

But suppose a substance having no power of need reduction 
chances to have a pattern of stimulation much like the substance 
which previously has reduced a need. This stimulus will, through 
stimulus generalization (X), evoke striving activity. Does this 
mean that the second substance has value for the organism? To 
assert this would be something like a play on words. One could 
properly say, however, that the organism values this second object 
or substance but that the object or substance in question has no 
value for the organism. The latter is demonstrated behaviorally 
by the fact that after striving reactions have been evoked a few 
times by the falsely valued object, experimental extinction (IX) 
will supervene and this particular stimulus complex will no longer 
evoke striving (ix, x, and xi). 

A paradox not quite so easily resolved is that in which the striving 
has been generated through secondary reinforcement; i.e., where 
[...] up through the action not of a state 
[...] actually reduces a drive, but of a secondary rein- 
forcing [...] furnished by Wolfe’s 
chimpanzees, which would work for, and treasure, poker chips 
that might [...] be inserted in a slot machine which would 
then always [...] a grape for each chip. Here, of course, we 
approach [...] money value. [...] chimpanzee strives for the 
poker chip, it may properly be [...] has no capacity to reduce 
[...] primary need and therefore it may be 
said to have no [...] value. But since it is an indispensable 
means to the securing of [...] which do reduce the primary need 
for food, the poker [chip] possesses secondary reinforcing powers. 

[...] grape when inserted in the slot [...] 
by his chimpanzees to a type 
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p. 327). Consistency is attained only by a very large group of 
organisms taken as a whole or by single organisms when very many 
trials are massed. It is only in this latter situation that Robbins’ 
proposed consistency is attained. In economics it is attained through 
the pooled action of many individuals. 

There is another type of apparent inconsistency in valuative 
choice behavior; the nature of this is well recognized by economists, 
among whom it is known as the Law of Diminishing Marginal 
Utility (11, pp. 136 ff.). This law formulates the familiar fact of 
satiation, that the more an organism has of a given reinforcing 
substance or commodity the less it will strive for an additional 
increment. In systematic behavior theory this general subject is 
called motivation (4, p. 226). A recent experimental study by Perin 
(5) indicates that, other things equal, the valuation, or K', which 
an organism places upon a bit of food is the product of an increasing 
function of the need or drive for food (D) multiplied by habit 
strength, sHR (VIII). The product (sER) is bound to rise or fall 
as D rises or falls. Recently both these functions have received 
preliminary empirical determination. Because of the oscillation of 
sHR and its multiplication by D, the momentary variability of 
valuative behavior (sER) in no sense indicates a lack of lawfulness 
in the primary behavior principles, since the oscillation function 
(sOR) itself is lawful (XII). 
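
How lawful oscillation can yield seemingly capricious single choices may be shown by a small simulation. The sketch below rests on assumptions that are not in the text: the oscillation of each potential is taken to be an independent Gaussian perturbation, and the function name `choice_rate`, the spread `sigma`, and all numerical values are hypothetical.

```python
import random


def choice_rate(e_a, e_b, sigma=1.0, trials=10_000, seed=0):
    """Fraction of trials on which reaction A dominates reaction B.

    e_a, e_b : central (mean) reaction potentials, arbitrary units.
    sigma    : spread of the momentary oscillation (assumed Gaussian here;
               the text states only that the oscillation is lawful).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    a_wins = 0
    for _ in range(trials):
        # Each potential oscillates independently from moment to moment.
        momentary_a = e_a + rng.gauss(0.0, sigma)
        momentary_b = e_b + rng.gauss(0.0, sigma)
        if momentary_a > momentary_b:
            a_wins += 1
    return a_wins / trials


equal = choice_rate(1.0, 1.0)      # near one half: maximal inconsistency
moderate = choice_rate(2.0, 0.0)   # A usually, but not always, dominates
extreme = choice_rate(10.0, 0.0)   # the two ranges no longer overlap in practice
print(equal, moderate, extreme)
```

Under these assumptions single trials look inconsistent while the proportions over many trials are stable, which is the sense in which consistency is attained only when very many trials are massed.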

Still a third source of what appears superficially to be behavioral 
inconsistency or capriciousness in the sense of the lack of the opera- 
tion of natural law in valuative behavior is brought about by the 
differences in the histories of the valuing organisms. If one organ- 
ism in the past has had its food needs reduced exclusively by a diet 
presenting certain stimulus characteristics, and another has had 
the corresponding needs reduced exclusively by a diet presenting 
distinctly different stimulus characteristics, each organism will 
strive with maximum vigor for its accustomed food and not for the 
food of the other. This in no sense implies a breakdown of primary 
natural-science dynamic laws. It is true that the natural laws 
involved are not the laws of Newtonian mechanics. It is also true 
that behavior laws, owing to the principle or law of the oscillation 
of reaction potential (sOR), are molar in the sense that they hold 
strictly only for central tendencies calculated from 
samples of carefully measured data. Nevertheless, in the molar 
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Why Do Organisms Value the Same State of Affairs Differently on Different 
Occasions? 

The question of the consistency and inconsistency of an organism 
in its valuation of various states of affairs is a matter of some im- 
portance to the logical foundations of value theory. For example, 
Robbins remarks (11, pp. 91-92): 

The celebrated generalization that in a state of equilibrium 
the relative significance of divisible commodities is equal to 
their price, does involve the assumption that each final choice 
is consistent with every other, in the sense that if I prefer A to B 
and B to C, I also prefer A to C . . . 

From the point of view of the present approach, the consistency 
of organisms in making evaluative choices is not necessarily a 
syllogistic matter, as might possibly be supposed from the above 
quotation, since it is displayed by subhuman organisms which 
presumably do not syllogize. To syllogize involves the use of words 
or equivalent symbols (pure-stimulus acts), whereas subhuman 
animals do not employ language in any proper sense. Behavioral 
inconsistency in evaluative choices of both humans and lower 
animals is believed instead to be a function of the spontaneous 
oscillation of the reaction potential (sOR), (XII). Where two reac- 
tion potentials of equal strength are in competition, as in simple 
discrimination situations, each appears to dominate on fifty per 
cent of the occasions (instead of neither one occurring at all, as 
would be the case with a coarse balance); this condition of equal 
[...] therefore yields a maximum of inconsistency in 
[...] the [...] potentials becomes stronger than 
[...] trial-and-error learning; see Chapter 2), 
[...] reaction [...] half the time. But since the 
[...] oscillate independently the weaker 
of the two [...] chance at the moment of stimulation 
to be [...] usually stronger [...] to be in a low oscillatory 
phase (4, p. 14[...]) [...] potential, will occur on single 
trials only when the difference [...] 
great that the lowest oscillations of [...] potential no longer 
are exceeded by the highest 
oscillations of the weak potential 
sense just indicated there is strong reason to believe that all be-
havior, including that of evaluation, displays definite calculatable
and predictable characteristics provided the habit structure (sHR)
of the organism, the reinforcing characteristics (K) of the stimulat- 
ing situation, the drive (D), and the stimulus intensity (V) are 
known. Alternatively, prediction may be made from a knowledge 
of the history of the organism and of the immediate stimulating 
situation because, theoretically at least, the characteristics of the 
habit strength may be calculated from a complete knowledge of 
the organism’s history. Thus in the variations of evaluative behavior 
there is no evidence that a genuinely determinate behavioral 
dynamics is lacking. 

Finally, a fourth source of evaluative differences, those between
organisms with the same history, is due to different constants
characteristic of different individuals, e.g., in the learning exponent
(IV). Consequently one organism may follow the law of learning
just as exactly as the majority even though more rapidly or more
[slowly, depending] on the learning parameter possessed by him (6).
[Such differences would] thus be entirely consistent with a general
lawfulness of valuative behavior.


The Natural-Science Status of Certain Classes of Values

[...] on several different types of [...] dogmatically indicated [...]
involves the [...] obvious and primitive aspects of the potentially [...]
amount of [...] possessed by person No. 1 for a certain
[amount of com]modity Y possessed by person No. 2 [...]
and continuously only on the condition that person No. 1 has a
striving potential for the amount of Y which is greater
than would be his striving [potential for X when he]
no longer has that portion of X [...] person No. 2 has a
striving potential for X which [is greater than would be his for Y]
when he no longer has Y. [...] potential of both
persons is [...] by the transaction.

In the case just considered X and Y [...]
mediate the reduction of specific primary needs. In the more
sophisticated cultures, some substance which is more easily trans- 
ported than most commodities, and which while usually not cap- 
able of directly mediating the reduction of any primary need is a 
dependable indirect means to a considerable variety of such need 
reductions, usually comes to be employed as a medium of exchange, 
i.e., money. This is possible because money, being an indirect 
means to need reduction, becomes a secondary reinforcing agent (ii). 

According to the present analysis, economics appears to be a 
kind of hybrid science inasmuch as it has its source in the applica- 
tion of a number of different primary or pure sciences. For example, 
the pure-science aspects of psychology or behavior have been 
stressed above, but this by no means covers the whole discipline. 
There are also the scientific aspects of production. In the case of 
agricultural economics there is involved in production the addi- 
tional primary science of plant growth, from which there flows the 
secondary or applied principle of diminishing crop yield of a plot 
of earth per unit of labor employed as the amount of labor spent upon 
it is indefinitely increased. This is the so-called Law of Diminishing 
Returns. 


A second type of value and valuation of great significance is
found in truth. In one sense a bit of truth may be specified as a
statement whose symbols accurately correspond to their referents.
Organisms strive for truth because it constitutes, or contains the
means to, a dependable representation of selected portions of the
environment. All organisms, particularly those with distance
receptors, learn early in life to expose their receptors in such a
way as to receive the most adequate impact of environmental
stimuli at critical points of behavior sequences. These habits are
largely organized by compound trial-and-error learning (Chapter
6) and maintained by means of the secondary reinforcement
based on stimulus traces, the ultimate reinforcement being the
goal attainment which in general cannot be achieved without the
exposure of the receptors to the environment. For example, in
case a marksman is at a place (or time) such that he cannot make
the necessary receptor exposures for the appropriate observations,
a parallel observation made by a second person more favorably
situated can be conveyed to the first by means of language or
symbols. Through the learned equivalence of stimulus patterns,
the stimuli resulting from the language are approximately sub-
stitutable for the needed but inaccessible stimulus pattern which
would result from direct observation, and thus a hit may be made. 
This kind of truth may be called factual truth or information, as 
distinguished from misinformation, error, or untruth. That truth 
is valued is shown by the fact that it is widely striven for. 

Alternatively, truth of the natural-science theoretical variety
may be defined as the characteristic symbolically formulated rule
or principle, e.g., an equation, which applies accurately to certain 
types of relationship in such a way that when numerical values 
based on relevant observations are substituted in the equation there 
is secured a new quantitative value which agrees with fact. The 
value secured may be an end in itself, i.e., the satisfaction of 
curiosity; or it may lie in a subgoal, the final goal being the fulfill- 
ment of some primary need. Truth is striven for originally because 
it is a means to need reduction and so receives indirect, i.e., second- 
ary, reinforcement (4, p. 84). In this way theoretical-truth value 
is conceived to arise. Scientific truth is widely striven for. 
[...] values of a still more complex nature arises from
[the fact that people] usually live in fairly close association, and thus
the behavior of other people often becomes a matter of acute
concern to each individual. This concern has two contrasted
[aspects, positive and] negative. In the positive aspect one organism
[(the initiator) uses] a second organism (the subject) much as a
tool [...] to a state of [affairs which is]
primarily or secondarily reinforcing to the initiator.
[To] accomplish this through the behavior of the
subject the initiator [must], in accordance with the laws
[of behavior, provide] adequate motivation [...] will reduce the
latter’s needs; otherwise the subject’s behavior will not be as
desired or will [...] who has amply [...]
promising some of the prepared food [as a] potential reinforcement or
reward. In this way [both] receive nutriment and therefore
primary reinforcement [...]. This is an example
of the Law of [...]. This law underlies all
social transactions [and may be] formulated as our final theorem:
THEOREM 133. Every voluntary social interaction, in order to be
repeated consistently, must result in a substantial reinforcement to the 
activity of each party to the transaction. 


This formulation implicitly presupposes the setting of ordinary 
respectable economics which is based on exchange. There is, how-
ever, another side to the picture. The same principle operates in
situations where the initiator organism, or group of organisms, has
sufficient power to resort to coercion. In the one case the initiator
will create a need, usually primary, which would not otherwise
exist; in the other case the initiator will prevent the reduction of a
need already or potentially existent in the subject. This ugly
phase of the control of behavior leads to slavery and forced labor
as a limiting case. It also appears currently in various forms of
racketeering.


Through trial and error the subject organism often finds ways
of preventing the occurrence of this type of behavior on the part
of the initiator organism, or of terminating it if it is already under
way. One of these involves return punishment, the causing of
injury to the offending (initiator) organism. Such acts of counter-
attack are frequently reinforced because they cause the offender
to take flight, thereby terminating the need he was causing. Sim-
ilarly, the flight reaction is reinforced in the offender because it is
followed by the cessation of the injury (need) caused by the counter-
attack. Here again we have a case of reciprocal reinforcement.

It happens that certain signs such as frowns and other kinds of
threatening movements, as well as certain words (overt threats),
through their association with attack, acquire the power of evoking
incipient flight reactions (fear). Through trial and error, habits of
performing these social “pure stimulus” acts are acquired by
organisms, and they are used where effective in place of physical
attack. Accordingly words acquire a certain real power to punish,
and so to deter, transgressors. And since the statement that a
person has transgressed in a certain way is associated with punish-
ment and such a statement is a moral judgment, it comes about
that the overt passing of an adverse moral judgment becomes a
deterrent to forbidden acts. In a similar manner, the passing of a
favorable moral judgment becomes a secondary positive reinforcing
agent fostering desirable action. Because these effects are rein-
forcing, the passing of positive moral judgments becomes another
example of reciprocal reinforcement.

It is clear from the foregoing discussion that natural-science 
methodology presumably will be able, ultimately, to deduce from 
its principles all kinds of behavior of organisms, whether generally 
characterized as good, bad, or indifferent. Moreover, since the 
passing of a moral judgment is itself a form of verbal behavior, 
either overt or covert, it is to be expected that natural-science 
theory will be able to deduce the making of moral judgments 
along with other forms of behavior. 


Is a True Natural Science of Ethics Possible? 


At the outset of this discussion we must recall to the reader a
commonplace which has been implicit throughout the preceding
chapters: the methodology of validating natural-science laws. Stated
simply, the validation depends upon two factors: the conditions pre-
ceding an event and the principles or laws upon which the outcome
is supposed to depend. The conditions are normally quantitative
values such as the length of a pendulum suspension and the value of
gravity; and the law in that case would be the equation,

$$P = 2\pi\sqrt{\frac{l}{g}}.$$

[If the law] is sound the substitution of any length of suspension
(l) [at a place] having a known gravitational value (g) will
[yield a computed period (P). The] validation process is to
observe whether [the period of a] concrete pendulum agrees with
the computed [value. All] laws and combinations of laws
must satisfy this type of validation test.
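
The validation procedure just described can be put in concrete form. The following is a minimal sketch rather than anything given in the text: the function names, the standard value assumed for g, and the measurement tolerance are all illustrative assumptions, and the law itself holds only approximately, for small swings.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2; assumed value of g for the example


def pendulum_period(length_m, g=STANDARD_GRAVITY):
    """Period predicted by the law P = 2*pi*sqrt(l/g), in seconds."""
    return 2 * math.pi * math.sqrt(length_m / g)


def law_agrees(length_m, observed_period_s, g=STANDARD_GRAVITY,
               tolerance_s=0.01):
    """Validation test: does the observed period of a concrete pendulum
    agree with the computed period within an assumed error of measurement?
    """
    predicted = pendulum_period(length_m, g)
    return abs(predicted - observed_period_s) <= tolerance_s


predicted_one_metre = pendulum_period(1.0)
confirming_case = law_agrees(1.0, 2.01)    # observation close to prediction
disconfirming_case = law_agrees(1.0, 2.5)  # observation far from prediction
```

The point of the sketch is only that the conditions (l and g) and the law jointly yield a definite value with which an observation either does or does not agree; the tolerance stands in for whatever error of measurement the observer accepts.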

[...] statement of scientific methodology [...] of ultimately
attaining a molar theory of organismic behavior which will cover
all aspects of the [...] organisms will do [...]
possible, ultimately, to predict the [moral judgments] which people
make, i.e., what they will [say, overtly or] covertly, regarding their
approval or disapproval of the behavior [of others] as well as of their
own behavior. Therefore the [...]
will ultimately apply to [...] moral
behavior, even including the moral
judgment. Moreover, it is equally clear that such a theory when
worked out will be capable of being proved valid or invalid by the 
empirical test of observing what really happens in a behavior 
causal sequence following the occurrence of any dynamic conditions 
to which the theory applies (4, p. 12). 

But here we encounter a critical question, one concerning which 
there is a great deal of current confusion among both scientists and 
ethicists. Is pure science’s methodological capacity to mediate the 
prediction (the logical deduction) of the occurrence of an event 
under given conditions of behavior of whatever nature, whether
moral, immoral, or unmoral, the same thing as the capacity to
make a moral judgment, i.e., to characterize certain behavior abso- 
lutely as ethically good or bad? As so often happens, the clear 
posing of a problem furnishes us with valuable clues to its solution. 
The clue in the case of the present problem is the distinction 
between prediction and characterization. No ethical system known to 
the present writer attempts to predict the occurrence of any event 
whatever: the “laws” which are proposed are merely principles 
for characterizing acts as good or bad, as a basis for making moral
judgments. A moral judgment, like any other act, may serve as a 
test of the validity of supposed behavioral law but it does not itself 
state a law any more than any other ordinary act does. True natural 
laws have no exceptions. 

Does this difference between ethical theory and the theory of
moral behavior mean that ethical principles inherently can never
have the type of validity that the scientific theory of moral behavior
may have? We believe that the considerations just outlined leave
us no alternative. So long as ethical theory only mediates the
characterization of events as good or bad, statements of what men
ought or ought not to do, and never predicts the occurrence of any-
thing on the part of the subject, there can be no objective scientific
test of its truth or falsity; i.e., there is no scientific way of determin-
ing its validity.

But statements which cannot be tested for truth or falsity cannot
be said to be either true or false. This means that such statements
occupy a scientific no man’s land, which is practically equivalent
to saying that such statements are scientific nonsense. Probably this
is the reason why men who are familiar with the techniques
of science, by and large, are able in the course of time to attain




substantial agreement in regard to scientific matters but as a rule 
make little progress toward agreement in regard to matters of 
moral judgment, where serious concrete issues are involved. It 
follows that the so-called science of ethics, so far as ultimate ethical 
values are concerned, is a pseudo-science.* Meanwhile this presents 
no impediment in the way of the development of a true natural 
science of moral behavior, including the moral judgment as an 
act that is concerned with events which may be predicted and 
publicly observed. Neither does it impede the application of 
science in the determination of the most effective means of attain-
ing values of all kinds, ethical or otherwise.

By much the same reasoning we may show that the hope of 
somehow deriving ethical principles from the innate constitution 
of the “mind,” on the analogy of the “self-evident” truths of logic 
and Euclid’s approach to geometry, is also doomed to disappoint- 
ment. This is because there probably is no such thing as a self- 
evident truth in Euclid’s sense. The primary principles of logic 
and mathematics are believed to be those rules of reasoning 
(symbol manipulation) which have been found by trial to mediate
valid conclusions. The formulation of these principles has taken
centuries and is by no means complete even now. Scientific theory
requires for the derivation of valid theorems (1) sound scientific
principles and (2) sound logical rules for the mediation of the
[...] process. Therefore each empirically verified scientific
[theorem tends to] validate both the scientific principles employed
in its [...] and the logical rules whereby the scientific prin-
ciples were transformed into the theorem. Logical rules are vali-
dated in [...] scientific [...] that the innate constitu-
tion of the mind [...] scientific validation of [...].

An Objective Natural-Science Interpretation
of Some Typical Approaches to the Theory of Value

[...] intelligent scholars [...] giving an account of practically the
same phenomena there should be a very substantial identity in
the several systematic outcomes. This seems to be true in the case
of value theory. We believe that various current approaches to the
theory of value are essentially alike in that each in turn takes its
origin from a position which is substantially identical to one or
another phase of the logico-causal hierarchy of the natural-science
approach implicit in the preceding chapters. This approach may
be summarized as follows, the Roman symbols in parentheses
representing the relevant postulate of origin:

1. Original need (I; V).
2. Substance or state of affairs (K') possessing power of mediating
need reduction (VII).
3. Original need reduction (III).
4. Resulting habit formation (sHR) (IV).
5. Subsequent need or drive (D) (V).
6. Reaction (striving) potentiality (sER) (VIII).
7. Striving or work (W) (objective valuative behavior).


We begin our examination of the origins of various value
systems by citing an interpretation of Bentham’s pain-pleasure
hypothesis, which goes back to 1780 (1, pp. 339, 353):

Nature has placed mankind under the governance of two
sovereign masters, pain and pleasure . . . pleasure, and what
[comes to the] same thing, immunity from pain . . .

[Bentham’s] concept of pain is equated substantially to our own
[concept of need]. Bentham’s concept of pleasure is found
in [those] situations in which need or anxiety (the learned antici-
pation of the impending impact of a need) is in the process
of reduction. [As] shown at length in the preceding chapters, all
[striving], which is the immediate observable factor indicating
[value, arises] from sER, i.e., ultimately from need reduction.
Thus Bentham takes his point of departure approximately from
[...] logico-causal hierarchy. If he were writing today
he might conceivably say that value, or K' (2), becomes manifest
[in the presence] of need (1) through its power of need reduction
(3); that need reduction generates habit (4); that habit in conjunction
with subsequent need (5) generates striving potentiality (6),
which [mediates] the striving (7) and which normally constitutes
the objective evidence of evaluation. There accordingly appears
* The term ethics is employed [here in the] sense of the alleged science of
what absolutely is good or bad as distinguished from what particular individuals or
cultural groups say is good or bad.
for goals, is derivable by a process of learning which has its roots
in need and need reduction. 

Next in order we examine the approach of Perry. His key con- 
cept in value theory is interest. But Perry does not use this term as 
equivalent to attention. In this connection he remarks (9, p. 115): 
It is characteristic of living mind to be for some things and 
against others ... It is to this all-pervasive characteristic of 
the motor-affective life, this state, act, attitude, or disposition of
favor or disfavor, to which we propose to give the name of
“interest” . . . That which is an object of interest is eo ipso
invested with value.

In short, Perry uses interest as substantially equivalent to interested
action, or striving. Thus he takes his point of departure from levels
6 and 7 of the present natural-science logico-causal hierarchy. It
is to be noted, moreover, that Perry clearly recognizes that a process
of habit formation does take place and that striving consequently 
has its roots in the history of the organism, though he makes no 
attempt at a precise derivation of striving from original biological 
needs. 

The last systematic treatment of the problems of value and valua-
tion which we shall consider is that of Dewey, who advocates a
strictly natural-science approach. He states, for example (2, pp. 
63, 64): 


The separation alleged to exist between the “world of facts”
and the “realm of values” will disappear from human beliefs
only as valuation-phenomena are seen to have their immediate
source in biological modes of behavior and to owe their con-
crete content to the influence of cultural conditions. ... A
grounded theory of the phenomena of human behavior is as
much a prerequisite of a theory of valuation as is a theory
of the behavior of physical (in the sense of nonhuman) things.
The development of a science of the phenomena of living
creatures was an unqualified prerequisite of the development
of a sound psychology.

It is evident from this quotation that Dewey’s general approach to
the problems of value and valuation is substantially identical with
our own. He does not go into the specific details of the
theory, but stresses the role of subordinate goals and the
complex processes of valuative procedures such as employ
to be substantial harmony between the present systematic approach
and that of Bentham. 

Next we consider a much more recent work, that of Urban. 
Urban’s critical value concept is feeling.* He says (12, p. 22):

Existence is perceived; truth is thought; value is felt . . . 

The feeling of value includes the feeling of reality. 

Urban’s basic postulate is not really very different from Bentham’s, 
because for Urban feeling is affect. But affect is the pleasantness or 
unpleasantness aspect of stimulating situations, and pleasantness 
and unpleasantness are essentially pleasure and pain. Accordingly, 
Urban’s theory of value may be regarded as taking its origin from 
substantially level 1 of our own development, exactly as may 
Bentham’s, and for the same reasons. 


We pass next to a group of writers among whom are found most
modern economists. Their approach derives value from wants (3,
p. 1). This notion, while somewhat vague, appears to be very
nearly equivalent to need except that the emphasis is naturally
placed on specific objects or commodities wanted, on the one hand,
and on potential action to obtain the commodities on the other.
When they say that a person wants bread it is equivalent to saying
[that he has a] need which bread has the power of reducing, and
by implication that he possesses an internal habit structure which
under appropriate stimulation will lead to striving with bread (the
eating of bread) as the goal. It is accordingly evident that this
[school] sets out from a place in the natural-science logico-causal
[hierarchy leading] from primitive needs to value and valuation,
[...] hierarchy at a tabular level considerably below
that [...] Urban, probably somewhere near level 6.

[...] from [the] same level in our logico-causal hierarchy
[...] Kohler (7), who derives value from
“requiredness.” [...] understanding, requiredness for
Kohler [...] incompleteness or
need. [...] of reaction [...]. Kohler’s approach accordingly also
finds its [...]
logico-causal hierarchy [...]
vector, at least so far as [...]

* The same may be said of Reid (10), a more recent writer somewhat influenced by
Urban.
and suitable equations were available. Consequently, introspective
reports concerning internal conditions are useful for rough quali- 
tative purposes; nevertheless they become inadequate wherever 
primary quantitative laws are in the process of systematic formula- 
tion or precise validation. Fortunately, as we have tried to show 
above, in the formulation of natural law it is not necessary to 
depend on such unsatisfactory evidence. We can utilize symbolic 
constructs. 

As quantitative behavioral symbolic constructs are gradually 
perfected and come into more general knowledge and use, the 
insistence of value theorists upon the logical primacy of introspec- 
tion may be expected correspondingly to diminish. Then the theory 
of value will cease to be a division of speculative philosophy and 
will become a bona fide portion of natural science.

References 


1. Bentham, J. An introduction to the principles of morals and
legislation. British Moralists (ed. by L. A. Selby-Bigge), Vol. I.
Oxford, England: Clarendon Press, 1897.

2. Dewey, J. Theory of valuation. International encyclopedia of unified
science, Vol. II. Chicago: Univ. Chicago Press, 1939.

3. Fairchild, F. R., Furniss, E. S., and Buck, N. S. Economics. New
York: Macmillan Co., 1937.

4. Hull, C. L. Principles of behavior. New York: D. Appleton-
Century Co., 1943.

5. Hull, C. L. Value, valuation, and natural-science methodology.
Philos. of Science, 1944, 11, 125-141.

6. Hull, C. L., Felsinger, J. M., Gladstone, A. I., and Yamaguchi,
H. G. A proposed quantification of habit strength. Psychol.
Rev., 1947, 54, 237-254.

7. Kohler, W. The place of value in a world of facts. New York:
Liveright Pub. Corp., 1938.

8. Perin, C. T. Behavior potentiality as a joint function of the
amount of training and the degree of hunger at the time of
extinction. J. Exper. Psychol., 1942, 30, 93-113.

9. Perry, R. B. General theory of value. New York: Longmans,
Green and Co., 1926.

10. Reid, J. R. A theory of value. New York: Charles Scribner’s Sons,
1938.





pure-stimulus acts of verbal symbolism as mediating devices. His
point of origin can scarcely be assigned a particular place in our 
formal natural-science hierarchy, since implicitly he recognizes the 
whole of it. By emphasis, however, he seems somewhat to favor the 
aspect of objectively observable action which we have listed as 
level 7. 

Terminal Note 


THE SYSTEMATIC STATUS OF INTROSPECTION IN THE 
NATURAL-SCIENCE APPROACH TO THE THEORY OF 
VALUE 

There remain to be examined certain differences among the ap- 
proaches to value theory just considered as to implicit or explicit 
methodology. All of the writers mentioned, with the exception of 
Dewey and the probable exception of Perry, take more or less the 
subjective, introspective, or phenomenological approach to the 
theory of value, one of them (Kohler) somewhat insistently so.
From the present point of view the subjective states such as pain
[and pleasure are] characteristic internal conditions and are observ-
able by means of internal receptors. These receptors discharge
into the nervous system quite as do the external receptors (such
as the retina and the cochlea) and so in different combinations are
[conditioned to] responses of various kinds, including those of verbal
[...] constitute introspective reports and valuative
judgment.

[It would] appear that the presence of internal conditions or neural
organizations [such as habit] structures (sHR) and reaction poten-
tials (sER) [...] reportable. It is not clear, however, whether
the verbal [reactions which] constitute such reports are mediated
by direct connections [or whether] the connections are between the
effectors and the proprioceptive [...] tendencies to
action mediated directly by the habit structures in question. At
all events [...] such manner are fre-
quently [...]
[quantita]tively metricized history of the [...]
time or energy would not be available to make exact calculations
concerning relevant habit structures even if an adequate history



12. Concluding Considerations 


In our final chapter we wish to emphasize three types of related 
conclusions which seem to flow from the preceding theoretical 
elaborations. These concern the joint automaticity and adaptivity 
of the behavior forms deduced in this volume, the scientific sound- 
ness of the detailed behavior forms, and the additional behavior 
forms which will probably be deduced from the same general 
system in the not too distant future. 


Sample Automatic Adaptive Behavior Mechanisms 

Throughout the preceding pages we have been so largely con- 
cerned with the informal deductions which make up the bulk 
of this volume that we have taken no space to present our view of 
the biological (adaptive) picture as such. Even so, the reader has 
seen by now that the organism is here conceived as a completely 
automatic entity; that in our approach to behavior theory there 
is no entelechy, no disembodied mind, soul, or spirit which in some
way tells the various parts of the body how to cooperate behavior-
ally to attain successful adaptation, i.e., how to achieve survival.
When the various laws governing this behavioral automaticity are
completely known they presumably will be presented in full detail.
At that time these laws should be stated objectively at the outset,

* Munn ([...], p. 150) [...] believe in disembodied [...]
behavior being made up of learned complex [...]
whether these responses are simple or complex. It [...] automaticity that
is quite as automatic and self-regulating as simple automaticity. [...]
complex automaticity constitutes no more evidence [for an]
entelechy, or reason for indulging in anthropomorphism, than does simple
automaticity.
to reduce SD; we accordingly call it R-. The fact that R- requires
work (W) and that work generates (IR) naturally leads to experi-
mental extinction, which ultimately reduces the potentiality of
sHR- to the response threshold (sLR). This may be called negative
response learning; it protects organisms from exhausting themselves
in performing useless acts. Thus we have our fourth major automatic
adaptive behavior mechanism.

Now let us suppose as before that the dominant response of the
sUr hierarchy chances to be unadaptive (R_), but that the second
or next strongest response potentiality is truly adaptive; we therefore
call it R+. Here enters the principle of behavioral oscillation (sOr),
through which, with (or without) negative response learning, an
irregular alternation occurs between R_ and R+, resulting in trial
learning (XII). This is the combined occurrence of negative and
positive response learning in the same process; it is commonly
known as trial-and-error learning. As the irregular alternation of
the trials continues, R_ grows weaker and becomes less frequent,
while R+ grows stronger and becomes more frequent. This trial-and-error
learning constitutes our fifth major automatic adaptive
behavior mechanism.
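
The irregular alternation described above can be made concrete with a small simulation. The following Python sketch is only an illustration under stated assumptions, not part of the formal system: the starting potentials, the Gaussian form of the oscillation, the growth and extinction rates, and the ceiling `e_max` are hypothetical values chosen for the example, and the function name `simulate_trial_and_error` is likewise invented here.

```python
import random


def simulate_trial_and_error(trials=200, e_minus=2.0, e_plus=1.5,
                             gain=0.05, loss=0.05, sigma=0.5,
                             e_max=3.0, seed=0):
    """Toy model of two competing reaction potentials (all values assumed).

    e_minus: potential of the unadaptive response R-, initially dominant.
    e_plus:  potential of the adaptive response R+, initially weaker.
    """
    rng = random.Random(seed)
    history = []
    for _ in range(trials):
        # Behavioral oscillation: each potential is perturbed independently
        # on every trial, so the weaker response sometimes wins.
        momentary_minus = e_minus + rng.gauss(0.0, sigma)
        momentary_plus = e_plus + rng.gauss(0.0, sigma)
        if momentary_plus > momentary_minus:
            # R+ occurs and is reinforced: positive response learning.
            e_plus += gain * (e_max - e_plus)
            history.append("R+")
        else:
            # R- occurs unreinforced: negative response learning (extinction).
            e_minus -= loss * e_minus
            history.append("R-")
    return history, e_plus, e_minus


if __name__ == "__main__":
    history, e_plus, e_minus = simulate_trial_and_error()
    print("R+ in first 50 trials:", history[:50].count("R+"))
    print("R+ in last 50 trials: ", history[-50:].count("R+"))
    print("final potentials:", round(e_plus, 3), round(e_minus, 3))
```

With these assumed values R- tends to dominate the earliest trials, but each unreinforced occurrence weakens it while each reinforced R+ strengthens its competitor, so that R+ comes to prevail. Raising `sigma` lets the weaker response break through sooner, which is the role the text assigns to oscillation.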


We now pass to a second form of joint positive and negative trial
learning. Let us suppose that an adaptive response is conditioned
to a given point on a stimulus generalization continuum, but that
the stimulus at a different point on the continuum operates on the
organism to evoke the same response, which in this situation is
maladaptive. Here we have a case where a primary behavior law
(generalization) produces a major but temporary maladaptivity.
The same response will become R+ or R_, depending upon which
part of the same stimulus continuum is operating. In order to
avoid the superficial paradox of calling the same response both
R+ and R_ according to the evoking conditions, we shall now
attach the plus and minus signs to the stimuli, as S+ and S_.

This maladaptivity is easily and automatically remedied. The
joint strengthening of S+ → R at one point of the continuum and
weakening of S_ → R at a different point on the continuum, together
with the principle of stimulus generalization, will tilt the
reaction-evocation power in favor of this section of the
stimulus continuum for this particular response, which will finally
reduce the S_ → R to the reaction threshold (sLr). This is properly
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perhaps after the general manner of Chapter 1 in this volume. 
We are at present a long way from knowing these laws with pre- 
cision, and some of those which we think we know almost certainly 
will later prove to be in error. Nevertheless it may help to fill out 
the reader’s picture of the adaptive aspects of the present system 
if we sketch in a tentative manner typical aspects of its automaticity. 
This will consist in the brief presentation of eight examples of
adaptive automatic behavior mechanisms.

Organic evolution has provided the normal organism at the 
beginning of its life with (1) receptor organs and (2) responding 
organs. The two types of organs are similarly connected by the 
nervous system to form unlearned stimulus-response connections 
or reflexes (Postulate I). These inborn response tendencies (sUr)
are the body’s first major automatic mechanisms for adapting to various 
types of emergency situations. 

But the processes of evolution have definite limits. The organism
has not been provided with ready-made reflexes for evoking
adaptive responses to the infinity of complex situations in which it
will find itself. To meet this type of emergency, evolution has
developed a second automatic device. This is the primitive capacity
to learn; to profit by past experience (III). Learning thus constitutes
the second major automatic adaptive behavior mechanism, which provides
a slightly slower means of adaptation to less acute situations.
The learning itself is seen in the conditioned reflex. A neutral
stimulus (S) [...] response (R) to an injury which is followed
by [...] (Sd) tends to set up a learned habit
S → R; [...] stimulus, or set of stimuli. Now the
S of the habit [...] a situation, stimuli
which shortly preceded the [...] principle of
stimulus generalization, therefore, the defense withdrawal action
will on [...] occasion, and so will reduce [...].
The learning law coupled [...] automatically
yields the antedating defense [...] adaptive.
Here, then, we have our third major automatic adaptive behavior
mechanism.

[...] consists in a hierarchy
of somewhat diverse [...]. [...] suppose that
the strongest response potentiality of the hierarchy chances not
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The Test of a Sound Theory 

At this point we pass to our second concluding consideration, that 
of the scientific soundness of the deductions yielded by the system. 
Two general types of procedure are followed by those who attempt 
to evaluate the validity of a theoretical system. The first method, 
representing the German philosophical approach (3, pp. 69, 684,
685), begins in a negative manner by marshalling a priori arguments
designed to reveal the fallacies of potentially conflicting approaches;
it then proceeds to defend the conclusions arrived at in the system
being evaluated by showing their general harmony with some
metaphysical principle or dogma. Our own method, on the other
hand, is patterned after the objective procedures of the physical
sciences. Those who follow this approach take the view that the
basic criterion of the soundness of a theoretical system is the extent
to which the deductions from the system correspond to empirical
fact. In the present immature state of the behavior sciences, the
importance of this method of evaluation cannot be too strongly
stressed. 


In scoring the various theoretical deductions of the system pre- 
sented here, we have prepared a special summarizing table for 
each chapter which contains such deductions. A typical example 
of these tabulations may be seen in Table 37, which represents
Chapter 2. This table gives the total number of theoretical propositions
presented (in this table, 22), and indicates whether or not
relevant empirical evidence has been found regarding the validity
of each and, in case it has, whether it is judged valid (+), uncertainly
valid (?), or invalid (-). The opinions of individual scientists
will, of course, differ in such matters; each scientist who knows
the empirical field will wish to arrive at parallel judgments for
himself. 


The results of the validity tables of all the ten chapters thus
scored were then combined. Of the 178 formal theoretical propositions
contained in the volume, 93, or 52 per cent, were judged as
possessing definite or direct empirical evidence bearing on their
validity (+). Thirty, or 17 per cent, were judged as possessing
approximate or indirect empirical evidence (±) as to their validity.
Fifty-five, or 31 per cent, of the 178 theoretical propositions were
judged as not covered
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described as positive-negative stimulus trial learning. It is usually
known as discrimination learning. This tilting of the evocative power 
of the stimulus continuum by the organism’s learning S+ and S_ 
is obviously a sixth major automatic adaptive behavior device. 
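
The tilting of the stimulus continuum can be sketched numerically. The Python fragment below is a minimal illustration, assuming an exponential generalization gradient and purely hypothetical values for the positions of S+ and S_, the gradient rate, the amounts of excitation and inhibition, and the reaction threshold; none of these numbers comes from the text, and the function names are invented for the example.

```python
def gradient(distance, peak, rate):
    """Generalized strength at a given distance from the conditioned point.

    Assumed form: an exponential decline, peak * 10 ** (-rate * |distance|).
    """
    return peak * 10 ** (-rate * abs(distance))


def net_potential(x, s_plus=0.0, s_minus=2.0,
                  excitation=1.0, inhibition=0.6, rate=0.15):
    """Excitation generalized from S+ minus inhibition generalized from S_."""
    excitatory = gradient(x - s_plus, excitation, rate)
    inhibitory = gradient(x - s_minus, inhibition, rate)
    return excitatory - inhibitory


def evokes_response(x, threshold=0.1, **kwargs):
    """True when the net potential at point x exceeds the reaction threshold."""
    return net_potential(x, **kwargs) > threshold


if __name__ == "__main__":
    # Before discrimination training (no inhibition at S_), the response
    # generalizes to S_ and is evoked there as well.
    print("S_ evokes response before training:",
          evokes_response(2.0, inhibition=0.0))
    # After training, S_ falls below threshold while S+ stays above it.
    print("S_ evokes response after training: ", evokes_response(2.0))
    print("S+ evokes response after training: ", evokes_response(0.0))
```

The design choice worth noting is that nothing acts on the response itself; only the two gradients along the stimulus continuum change, which is why the signs are carried by the stimuli rather than by the response.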

We next consider a stimulus continuum which we call the 
stimulus trace (s). This, through decay, has a natural tilt downward 
beginning soon after the stimulus is received and extending downward
for some seconds. In this way a stimulus through its trace
may be conditioned to a response some seconds after the physical 
stimulus has ceased to exist. This continuum yields generalized 
responses in both directions, but notably from the low subsident 
end of the trace toward the relatively high antecedent end. In 
these circumstances the response will antedate the conditions under 
which the habit was set up; i.e., this continuum and combination
of circumstances yield a second type of antedating defense reaction. A
defense reaction such as flight, which occurs before a dangerous 
event is encountered, clearly constitutes our seventh major automatic 
adaptive behavior device. (Incidentally this mechanism automatically 
spans time for the organism.) 

Now we come to the behavioral mechanism known as the fractional
antedating goal reaction, together with its proprioceptive stimulus
correlate, rG → sG. The rG is a pure-stimulus act (©) which tends
to antedate all goals established by a given organism. It follows
that the proprioceptive goal stimulus (sG) will automatically precede
each such goal, as well, of course, as the acts by which the goal
has already been attained. Thus each sG is a stimulus leading to the
[...] particular goal. Clearly the automatic (stimulus)
guidance of organismic behavior to goals is adaptive in the highest
degree. [...] major automatic device presumably will
[...] understanding of thought and reasoning,
[...] highest attainment of organic evolution.
[...] the rG → sG mechanism leads in a strictly logical
manner [...] regarded as the very heart of the
psychic: [...] expectancy, purpose, and so on.
This is our eighth major automatic adaptive behavior mechanism.

[...] mechanisms presented to
[...] self-maintenance of the mammalian
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covered by empirical fact will in the course of time be investigated. 
These are genuine predictions, and experimentalists seeking fertile 
fields for investigation will find in them challenging targets for 
research. It will be particularly interesting to see what per cent of
validity the predictions will show as experiments gradually produce 
the relevant evidence. 

They who know the history of theoretical psychology will under- 
stand that the present system is merely the most recent of a series 
of miniature systems evolved by the present writer. The coming 
generation of scientists will, it is hoped, present other theoretical 
systems, each succeeding one of a progressively more precise and 
quantitative nature. 

Theoretical Behavior Challenges of the Near Future 

We come now to our third concluding consideration — those 
behavior forms likely to be deduced soon. During recent years 
the physical sciences have been developing with a marked positive 
acceleration. Present indications are that the empirical behavior 
sciences are manifesting the same type of growth. One characteristic 
of this is that the theoretical or systematic development will follow 
with not too much delay the empirical growth. This means that
we may confidently expect a number of obvious systematizations 
during the next fifty years or so. The successful development of the 
behavioral sciences will be hastened by the early solution of a few
critical problems which we shall now consider.

One of the factors which retard both the empirical and the
theoretical growth of a science is an inadequate vocabulary or
set of signs by means of which the main concepts may be designated.
The general technique of symbolic logic, when separated from its
metaphysical entanglements, seems admirably adapted to this
service. In the development of this technique much care must be
devoted to the choice of the primary or undefined terms, so that
their meanings can be made public and objective by simple sensory
demonstrations and/or discriminatory differential reinforcements.
With a satisfactory set of undefined terms available, all the other
terms of the system should be defined in those primary terms
as the system develops. The point is that the terms or concepts of a
system can, and should, be built up systematically much like the
formal propositions or theorems.
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by known relevant empirical evidence. Of the 123 propositions 
wholly or partially covered by empirical evidence, 106, or 86 per 
cent, were judged as substantially validated; 14, or 11 per cent, were
judged as probably valid, though with considerable uncertainty;
and one proposition (related to the Weber-Fechner law), or about
1 per cent, was judged as definitely invalid. 
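
The tallies reported here can be checked by simple arithmetic. The short Python sketch below recomputes the percentages from the counts given in the text; the count of propositions with approximate evidence is taken as the remainder, 178 - 93 - 55 = 30, which agrees with the stated total of 123 covered propositions. The variable names are chosen for this illustration only.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


total = 178          # formal theoretical propositions in the volume
direct = 93          # definite (direct) empirical evidence
not_covered = 55     # no known relevant empirical evidence
indirect = total - direct - not_covered   # approximate (indirect) evidence
covered = direct + indirect               # wholly or partially covered

validated, probable, invalid = 106, 14, 1  # judgments on the covered group

coverage = {
    "direct": percent(direct, total),
    "indirect": percent(indirect, total),
    "not covered": percent(not_covered, total),
}
judgment = {
    "validated": percent(validated, covered),
    "probably valid": percent(probable, covered),
    "invalid": percent(invalid, covered),
}
# Propositions in the covered group not itemized by the three judgments.
unitemized = covered - (validated + probable + invalid)

if __name__ == "__main__":
    print(coverage)
    print(judgment)
    print("covered:", covered, "unitemized:", unitemized)
```

The recomputed figures match the percentages stated in the passage. The three judgment categories as listed sum to 121, so two of the 123 covered propositions are not itemized in the counts given here.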


TABLE 37. A typical validity summarizing table for Chapter 2. This shows (1)
whether definite (direct) empirical evidence exists (+), approximate (indirect) 
empirical evidence exists (±), or no empirical evidence ( — ) has been found bearing 
on the soundness of the theoretical deductions in question; and (2) whether the 
empirical evidence found is judged to support (+) or not support (— ) the particular 
deduction. A question mark indicates special uncertainty of judgment. Similar 
tables were made for Chapters 3 to 11 inclusive. 

Theoretical conclusions and/or parts | Relevant evidence found bearing on empirical soundness of deduction | Judged empirical soundness of theoretical deduction



[...] was considered to be definitely contrary [...].
[...] reflects to some extent the writer’s
avoidance of problems which would not yield readily to [...].
[...] reader will observe that the [...] range
of phenomena. Moreover, the 55 propositions presumably not yet
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symbolism. Perhaps because of this connection between speech and 
introspective reports, the Gestalt psychologists have specialized in
this field, and have made notable contributions to it (3). In the
present work it is conceived that the whole subject should be reworked
from a behavioral point of view, and that the various laws
peculiar to perception should be deduced in terms of sHr, sAr,
D, S, R, and so on. Thus a real scientific unity would be attained
(1). We are under the impression that several persons in various
parts of the world are spontaneously considering undertaking this 
task. Publications by the following are suggestive: Kenneth W. 
Spence, of Iowa State University (6); James S. Taylor, of the 
University of Cape Town, South Africa; Harold Schlosberg, of 
Brown University (5), and Daniel E. Berlyne, of St. Andrews, Scotland
(1).

A second division of individual psychology which urgently needs 
to be formally systematized and incorporated in the body of a 
behavior system is that concerning the emotions. Fortunately in 
the systematic work of Brown and Farber (2) we have an excellent 
groundwork for the scientific treatment of this hitherto elusive 
subject. 

A third division of individual psychology which has occupied 
serious minds from the earliest ages is concerned with the detailed 
mechanisms of abstract reasoning. The expectation of an early
and radical solution of this ancient group of problems lies in the
study of speech movements considered as pure-stimulus acts (©).
On this assumption, logic would become a set of rules by which
habits of manipulating verbal pure-stimulus acts eventuate into
valid motor adjustments to various life conditions. The subject
matter of Chapter 10 may be considered as a tentative gross-behavioral
approach to this great subject.

And finally, the crowning achievement of all will be the creation
of a really quantitative system of social behavior. Social psychology
is rooted in individual psychology because the latter furnishes us
with the skills which are employed in social intercourse and
communication, and necessarily must precede to some extent, but
world conditions are crying loudly for a really scientific system of
the inter-organismic behavior of groups. It seems incredible that
nature would create one set of primary sensory-motor laws for
the mediation of individual behavior and another set for the
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Closely related to the terms upon which a system is built is the 
matter of the behavioral units employed. All scientific systems of 
importance must be quantitative; quantification requires units, and 
systematic quantification requires a most meticulous definition of 
the units in their various relationships. For example, it should be 
possible to convert ten units of one motivation into an equivalent 
number of units in any other of the numerous types of motivation, 
so that all will yield strictly equivalent amounts of potential, or 
real, adaptive action (R). This will be an exacting task, probably 
extending over a very long time. The small and tentative beginning 
made in the present system (σ) will serve mainly to call attention
in a concrete manner to the problem.

The behavioral units employed are closely related to the matter
of the quantitative equations representing the relation of the
various behavior functions, such as sHr, D, sIr, and so on, in
the present system, to the number of reinforcements (N), the
length or amount of food privation (h), the number of extinction
trials (n), and so on. In the midst of these problems is the critical
[...] numerical values of the constants or
[...] equations. The history of the
physical sciences [...] this presumably will be accomplished
[...] approximations, but that even though the
problem is [...]

[...] quantitative aspects of systematic behavior
development toward [...] look forward with considerable
confidence, we turn first to [...] traditionally known as perception
(1). Now, perception [...] based on sensory stimulation
(S), together with [...] generalization, and the results of previous
learning (sHr) [...] makeshift we have been able to achieve so far [...]
separately. Perhaps one of the [...] failure to distinguish
sharply between [...] perception does not
interfere any more markedly [...] present deductions
is that the elements of both [...] generalization are
explicitly included in the [...]

Perception has ordinarily been reported by means of speech

Glossary of Symbols 

A = amplitude; a constant; distance between two objects in j.n.d.’s. 
a = empirical constant; an incentive substance (water).

B = mean number of reactions in a response cycle; mean number of 
responses per alternation cycle; exponential constant, 
b = empirical constant. 

C = the larger habit strength in behavioral withdrawal, G bHr = 
bHh; 

the larger reaction potential in behavioral withdrawal, C — 
bEr = bEr. 

Cd = condition producing a drive. 

D = drive; primary motivation; need; emotion; effective or gross 
drive; D = D' × ε.

D' = drive proper.

d = difference between two stimuli; difference between the loga- 
rithms of two stimuli. 
sEr = reaction potential.
sEr = some other sEr.

sEr+ = reaction potential of the “correct” reaction.
sEr- = reaction potential of the “incorrect” reaction.

+sEr = reaction potential of adient reaction.

-sEr = reaction potential of abient reaction.
sEr = generalized reaction potential; effective reaction potential;
generalized superthreshold reaction potential.
sEr = net reaction potential; sEr = sEr - sIr.
sEr = net discriminatory reaction potential; maximum reaction
potential at the point of original learning.

sEr = superthreshold portion of reaction potential; sEr = sEr - sLr.

sEr = momentary reaction potential.
b£<r = superthreshold reaction potential, bEr **■ 

F = food reinforcement; uniform factor of reduction of reaction
potential; mean number of uninterrupted sequences; mean
number of responses per alternation.

f = an incentive substance (food); function of ( ).

G = goal; goal object.

= goal-gradient index.

= goal-orientation index.

sHr = habit; habit strength.
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mediation of group behavior. Presumably, then, the laws which are 
derived for social behavior will be based for the most part on the 
same postulates as those which form the basis of individual behavior. 
If this turns out to be true, we are even now an appreciable dis- 
tance on our way toward the ultimate goal of integrating the 
individual-social sciences with the group-social sciences. 


1. Berlyne, D. E. Attention, perception and behavior theory. 

Psychol. Rev., 1951, 58, 137-146. 

2. Brown, J. S. and Farber, I. E. Emotions conceptualized as inter- 

vening variables — with suggestions toward a theory of frus- 
tration. Psychol. Bull., 1951, 48, pp. 465-495. 

3. Koffka, K. Principles of Gestalt psychology. New York: Harcourt,
Brace and Co., 1935.

4. Munn, N. L. Fundamentals of human adjustment. New York:
Houghton Mifflin Co., 1951.

5. Schlosberg, H. A note on depth perception, size constancy, and
related topics. Psychol. Rev., 1950, 57, 314-317.

6. Spence, K. W. Cognitive versus stimulus-response theories of
learning. Psychol. Rev., 1950, 57, 159-172.




R+ = appropriate, correct, or right response; a response that is 
reinforced. 

R_ = inappropriate, incorrect, or wrong response; a response that is 
extinguished (by non-reinforcement). 

Rg = consummatory response; reinforcing state of affairs; antedating 
goal reaction. 

0 = pure-stimulus act in general. 

rG = fractional antedating goal reaction; a concrete pure-stimulus act.

S = stimulus; stimulus energy; stimulus intensity. 

S = theoretical stimulus intensity which is functionally equivalent to 
a given molar afferent impulse; equivalent stimulus trace 
intensity; another stimulus. 

S+ = a stimulus aggregate that precedes a reinforced reaction (R+).

S- = a stimulus aggregate that precedes an unreinforced reaction
(R-).

S+ = stimuli originally conditioned to the reinforced reaction (R+).
S- = stimuli originally conditioned to the unreinforced reaction (R-).

S' = theoretical recruitment phase of molar afferent energy impulse.

S' = theoretical subsident phase of molar afferent energy impulse.

Sd = drive stimulus; need; drive intensity.

Sdh = drive stimulus due to hunger.

Sdt = drive stimulus due to thirst.

Sg = goal stimulus.

s = neurophysiological afferent impulse evoked by S; the trace of 
the stimulus afferent impulse. 

s' = theoretical molar afferent impulse corresponding to s; molar 
stimulus trace intensity; s' = log S . 

= theoretical recruitment phase of molar afferent impulse.

s = theoretical subsident phase of molar afferent impulse.

S = afferent impulse as modified by afferent interaction.

sG = fractional goal stimulus; proprioceptive stimuli resulting from
rG; proprioceptive goal stimulus; fractional antedating goal
stimulus.

t = time (usually in seconds); duration; delay in reinforcement.

t = time since the termination (or beginning) of a stimulation.

t = time since the maximum of the recruitment phase of a stimulus
trace; t' = J - .450".

str = reaction latency; reaction time.

sUr = unlearned receptor-effector connection.
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sHr = some other habit.

sHr = generalized habit strength; habit strength resulting from stimulus
generalization. 

h = hours of food privation; length or amount of food privation. 

Ir = reactive inhibition. 

Ir = reactive inhibition remaining after a period of spontaneous 
recovery. 

Ir = Ir "h sin. 

Ir = sEr; enough Ir to neutralize the superthreshold reaction
potential.

sIr = conditioned inhibition; inhibitory potential.

sIr = generalized inhibitory potential; generalized conditioned inhibition.

i = exponential constant. 

J — the influence on reaction potential reduction caused by the delay 
in reinforcement represented by: J « * D X Vj X K 

X bHr X 10~ X Vi; it also represents an earlier 
empirical fitted approximation (J * 

^ J *= an empirical exponential constant. 

j.n.d. = just noticeable difference; discrimination threshold.

V' — of reaction potential; incentive motivation. 

ts. ~ the physical incentive or reward in motivation. 

sLr = minimum reaction potential evoking reaction; reaction threshold.

M = learning maximum; maximum of reaction potential.
m = exponential constant.

N = number of reinforcements in general. 

er of reinforcements from the beginning of learning, i.e. 
from absolute zero (Z) & 

n ; nZb"' reinforcements. 

o unreinforced reaction evocations required to produc< 

^ experimental extinction. 

^nreinforced reaction evocations at a giver 

sOr = momentary behavioral oscillation.

P+ = probability of occurrence of th- .. 

P- = probability of occurrence of the ’ ■'‘==P°"“- 

I = response; an act of some 11 ^* 

R = wrong or unadaptive response. 
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V — stimulus-intensity dynamism; V = 1 — 10"**®*®; adient incen- 
tive intensity. 

V1 = stimulus-intensity dynamism involved in original learning.
V2 = stimulus-intensity dynamism which evokes the response.

W = work involved in a response (R). 
w = weight of food incentive. 


Y 


— response cycle asymmetry; Y — 


F.-F, 
Fp + F,’ 


y = distribution of momentary behavioral oscillation (sOr). 
Z = absolute zero of reaction potential. 

Δ = increment.

ε = inanition component of food privation drive; ε =

σ = the standard deviation.

∔ = behavioral summation.

∸ = behavioral withdrawal.

* — acquired receptor-effector connection. 

= unlearned receptor-effector connection. 

= causal relationship other than receptor-effector connection. 
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Conditioned defense reaction, 111 ff.
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333 ff. 


Diminishing returns, law of, 335 
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of, 189; in maze chaining, 172 ff., 
175 ff., 179, 180 ff.; and latency of 
response, 162, 163 ff., 167 ff.; length of 
chain in determining the difficulty of 
learning heterogeneous, 181 ff.; loco- 
motor “insight” in, 309 ff.; and pro- 
prioceptive stimuli, 158; serial rein- 
forcement and homogeneous, 167 ff.; 
serial reinforcement and heterogene-
ous, 170 ff.; simple locomotion as an
example of homogeneous, 188 ff.;
stimulus generalization in, 159 ff.;
stimulus trace intensity in, 160; termi-
nal reinforcement and heterogeneous,
165 ff.; terminal reinforcement and
homogeneous, 172 ff.; trial-and-error 
learning in, 156 

Behavior functions, need of quantitative 
equations relating various, 354 

Behavior link, delay of reinforcement and 
the, 7 ff., 126 ff.; empirical validity
of theoretical analysis of learning 
within the, 206 ff.; learning within the 
individual, 192 ff. 

Behavioral oscillation, and adient-abient 
competition, 249 ff.; and adient-adient
competition, 228; asynchronism in,
11 ff., 235; changes in the concept of,
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tional behavior, 217 ff.; of reaction 
potential in trial-and-error learning, 
20; of secondary reinforcement, 197; 
of the effects of experimental extinction, 

54, 113 ff.; on a black-white con-
tinuum, 63; path selection and stimu-
lus, 263; perseverative, 185; response
(see Response generalization); response-
intensity, 199 ff., 204 ff., 206 ff.; role
of the drive stimulus in, 124 ff., 138;
spatial, 271; stimulus (see Stimulus
generalization); the goal stimulus in,
308 ff.; the stimulus trace as a con-
tinuum for, 104 ff., 350
Generalization gradient, changes in the
steepness of, 68 ff.; determination of the
exponent of, 97; and discrimination
learning, 69 ff.; discrimination learning
and a post-discrimination, 61-64; dis-
crimination learning and a theoretical,
75'ff" ^ *^'‘ee-dbcriminanda problem, 
64 ff stimuli affecting the, 

’ rocchanbms mediating the. 


nicntl^ Gradient of reinforce- 

Icarr,:’ ^"*^^*P®tory turning in maze 
ff. hypothesis, 271 

tion blind-alley elimina- 

303’fr • learning, 275 ff., 

in? 207^^^°”’ anticipatory turn- 

maze learn- 

294 ff. ’ reinforcement in, 

st'Riuluj^ 124 ff. 


Gradient of reinforcement (see Delay of
reinforcement), abient behavior and 
the, 221 ff.; adient behavior and the, 
220 ff.; behavior chains and the, 158 
ff.; delay in reinforcement and the, 126 
ff.; double alternation and the, 185; 
path selection and the, 257 ff., 262 ff. 

Habit-family hierarchy, defined, 257 ff.;
example of, 256 ff.; in maze learning, 
287 ff., 304; spatial, 253 ff., 267, 289, 
303, 310 ff.; and “U”-shaped paths, 
263 ff. 

Habit formation, law of (Postulate IV), 6 
Habit strength, behavioral oscillation of, 
11 ff., 57; behavioral summation of, 8; 
behavioral withdrawal of, 8; drive in- 
tensity in the generalization of, 11; 
effect of additional practice upon, 66; 
effective, 11; fractional antedating goal
reaction and the generalization of, 127; 
in the constitution of reaction poten- 
tial, 7 ff.; introspective reportability of, 
344; latent and manifest, 140 ff.; and 
stimulus generalization, 10 ff.; and the 
nervous system, 329; and theory of 
value, 340 ff. 

Habituation, and secondary reinforce- 
ment, 173 ff. 

Hierarchy (see Habit-family hierarchy), 
innate responses, 347 ff.; of responses, 

5, 17; of valuation, 331 

Incentive, and abient-adient behavior, 
225 ff.; and Bentham’s pleasure-pain 
hypothesis, 341; delay in the receipt of,

7 ff.; effects of shifts in, 140 ff.; and 
latent learning, 140 ff.; reaction poten- 
tial as a function of delayed, 126 ff. 
Incentive motivation, in detour behavior, 

267 ff.; Postulate VII, 7 
Incentive reinforcement, in the constitu- 
tion of reaction potential, 7 ff. 

Incentive substances, behavioral summa-
tion of, 9; and theory of value, 340 ff.
Incidental stimuli, 64 ff., 91
Individual differences, and evaluative
behavior, 334; in the capacity for
insight, 316; Postulate XVII, 13; and
problem solving, 309; and species, 3
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generalization of, 7 ; in detour behavior, 
266 ff.; in double drive learning, 140 ff.;
inanition component of, 6; latent 
learning and the role of, 140 ff.; mainte- 
nance of coercion by, 337 ff.; pleasure 
and reduction of, 341; Postulate V, 

6 ff.; primary negative, 9; proper, 6; 
and secondary motivation, 6; and the 
constitution of reaction potential, 

7 ff.; and unlearned behavior, 5 
Drive intensity, generalization based 

upon, 11 

Drive reduction, and a theory of value, 
341; pleasure and pain as drive or, 342
Drive stimulus, drive condition and the, 7;
fractional antedating goal reactions
and the, 124 ff.; generalization and the
role of the, 124 ff., 138; generalization
continua and the, 124; in problem-
solving behavior, 308 ff.; motivation or
reinforcement and the, 5; reduction
and reinforcement, 152 ff. 

Dynamism (see Stimulus-intensity dyna-
mism)


Economics, as a science, 327 ff. 
Emotion, frustration and, 133 ff.; in
behavior theory, 355
Emotional response, and shifts in
incentive, 142

Entelechy, behavior theory and the, 347
Equations, /. 2, 3, 5; < 5. 6: 6 7 ft 
ff. ;2. 8. 13, 14, 15. k'l7, 

S I' 25, 1 

25, 27, 12; 25, 29, 30, 31, 13- 32 - 
26; 35, 35, 35 37 40- t ; 

®.«,43;4f,44;47;45;*;S’. 

79, 45. 46, 103; 47, 131; 43 , 140- 
158; 50, 51, Ifil; S2, 163- S3 174 - 

Etf ' “2; »'2 

Ethics, a natural science of, 338 ff.
Evaluative behavior, 333 
ErfuUon, adaptivn bnhavior and otga, 

^T."ndf ''“6. K 

nnd foranght, 1 51 ff., , 

« n funntlon 
TuH .h" >2: dnla, l.ami 

113 ff, dilTcrtnUal msistanen to, 65 


disinhibition of effects of, 302 ff., 304; 
and emotion, 134; and fractional 
antedating goal reactions, 135; gen- 
eralization of the effects of, 54; in maze 
learning, 300 ff., 304 ff.; in separate- 
discriminanda presentation discrimina- 
tion learning, 60 ff.; in insightful 
assembly of tool-using segments, 322 ff.; 
and incidental stimuli, 67; and increas- 
ing primary drive, 114; and inhibitory 
potential (Postulate IX), 9 ff.; and 
insightful behavior, 314 ff.; and nega- 
tive expectancy, 150; and partial rein- 
forcement, 120 ff., 134 ff.; and ratio of 
reinforcement, 134 ff.; and spontane- 
ous recovery, 17; and valuation, 330 

External inhibition, 20 

Fear, as an incipient flight reaction, 337

Feeling, as Urban’s value concept, 341 ff. 

Field theory, and behavior in relation to 
objects in space, 250, 267, 269, 271 ff. 
Foreknowledge, 151 ff. 

Foresight, 151 ff. 

Fractional action phases, within the 
behavior link, 193 

Fractional antedating goal reactions, 

124 ff.; and adaptive behavior, 350; as 
a function of the stimulus trace, 127; 
as a secondary reinforcing agent, 14, 

125 ff.; and delay in reinforcement, 150;
and double-drive learning, 136 ff.; and
drive stimuli, 124 ff.; and generaliza- 
tion of habit strength, 128; and insight- 
ful behavior, 311, 321 ff.; and latent 
learning, 148 ff.; and path selection, 
259; and resistance to extinction, 135; 
and stimulus-intensity generalization, 
315; the role of stimulus-intensity 
dynamism in, 131 

Fractional goal stimulus, and the condi- 
tioning of inhibition, 133 ff. 

Frustration, of an anticipation, 133 ff.; 
and revenal learning, 115 ff. 

Galvanic skin reaction, 116 ff. 

Generalization, based upon stimulus in- 
tensities, 78 ff., 84 ff.; examples of 100 
per cent, 65; fractional antedating goal 
reactions and stimulus intensity, 315; 
in limiting behavior, 308; in maze 
learning, 178 ff.; mediating temporary 
unadaptive behavior, 349; of abient or 
adient behavior, 217 ff.; of drive, 7; of 
fractional antedating goal reactions 
and stimulus traces, 125, 127; of habit 
strength and the fractional antedating 
goal reaction, 127; of inhibitory poten- 
tial, 11, 20, 22 ff.; of inhibitory poten- 
tial and the role of stimulus-intensity 
dynamism, 86; of inhibitory potential 
in triple discriminanda problems, 75 
ff.; of inhibitory potential in trial-and- 
error learning, 20, 23 ff.; of orienta- 
tional behavior, 217 ff.; of reaction 
potential in trial-and-error learning, 
20; of secondary reinforcement, 197; 
of the effects of experimental extinction, 
54, 113 ff.; on a black-white con- 
tinuum, 63; path selection and stimu- 
lus, 263; perseverative, 185; response 
(see Response generalization); response- 
intensity, 199 ff., 204 ff., 206 ff.; role 
of the drive stimulus in, 124 ff., 138; 
spatial, 271; stimulus (see Stimulus 
generalization); the goal stimulus in, 
308 ff.; the stimulus trace as a con- 
tinuum for, 104 ff., 350 
Generalization gradient, changes in the 
steepness of, 68 ff.; determination of the 
exponent of, 97; and discrimination 
learning, 69 ff.; discrimination learning 
and a post-discrimination, 61-64; dis- 
crimination learning and a theoretical, 
70; in a three-discriminanda problem, 
75 ff.; incidental stimuli affecting the, 
64 ff., 70; mechanisms mediating the, 
69 ff. 

Goal gradient (see Gradient of reinforce- 
ment), anticipatory turning in maze 
learning and the, 298; hypothesis, 
275 ff.; in maze blind-alley elimina- 
tion, 280 ff.; in maze learning, 275, 
303 ff.; index, 294; spatial, 275; 
temporal, 276 ff. 

Goal orientation, and anticipatory turn- 
ing, 297; index, 292; and maze learn- 
ing, 287 ff.; secondary reinforcement in, 
294 ff. 

Goal stimulus, 124 ff. 


Gradient of reinforcement (see Delay of 
reinforcement), abient behavior and 
the, 221 ff.; adient behavior and the, 
220 ff.; behavior chains and the, 158 
ff.; delay in reinforcement and the, 126 
ff.; double alternation and the, 185; 
path selection and the, 257 ff., 262 ff. 

Habit-family hierarchy, defined, 257 ff.; 
example of, 256 ff.; in maze learning, 
287 ff., 304; spatial, 253 ff., 267, 289, 
303, 310 ff.; and “U”-shaped paths. 

Habit formation, law of, 6 
Habit strength, behavioral summation of, 8; 
behavioral withdrawal of, 8; drive in- 
tensity in the generalization of, 11; 
effect of additional practice upon, 66; 
fractional antedating goal 
reaction and the generalization of, 127 

Ha''biiaUon“,''and secondary relnfor^- 

“'"'’w te Habit-famuy hierarchy). 
Hierarchy U _ ^ responses, 

innate response, 54/ n , 

5 17; of valuation, ooi 

. and abient-adient behavior. 

'TzS mi Beatham’s pleasur^p^n 
- ^41 • delay in the receipt of, 

^ 140 ff.; reaction poten- 

laS^tW motivation, in detour behavtor. 

S ff- Postulate VII, 7 
. relt^e reinforcement, in the commu- 
nion of reaction potential. 7 fh 
* ■ ..thstances. behavioral summa- 

'"'"n79lnd<h~ryofvalue,340 fr. 

Individual differences, and evaluative 
behavior, 334; in the capacity for 
insight, 316; Postulate XVII, 13; and 
problem solving, 309; and species, 3 
generalization of, 7; in detour behavior, 
266 ff.; in double drive learning, 140 ff.; 
inanition component of, 6; latent 
learning and the role of, 140 ff.; mainte- 
nance of coercion by, 337 ff.; pleasure 
and reduction of, 341; Postulate V, 

6 ff.; primary negative, 9; proper, 6; 
and secondary motivation, 6; and the 
constitution of reaction potential, 

7 ff.; and unlearned behavior, 5 
Drive intensity, generalization based 

upon, 11 

Drive reduction, and a theory of value, 
341; pleasure and pain as drive or, 342 
Drive stimulus, drive condition and the, 7; 
fractional antedating goal reactions 
and the, 124 ff.; generalization and the 
role of the, 124 ff., 138; generalization 
continua and the, 124; in problem- 
solving behavior, 308 ff.; motivation or 
reinforcement and the, 5; reduction 
and reinforcement, 152 ff. 

Dynamism (see Stimulus-intensity dyna- 
mism) 


Economics, as a science, 327 ff. 

Emotion, frustration and, 133 ff.; in 
behavior theory, 355 

Emotional responses, and shifts in amount 
of incentive, 142 

Entelechy, behavior theory and the, 347 
EquaUons, I, 2, 3, 5; 4, ^ 

’0, 11, 12, 8| 13, 14, IS, IS, 17 s’ 
ra. 1% 20, 10: 21, 22, 23, 24, 25, II- 
S’ « *’■ 2'’; 

m «.• 23, 42- 

27, 131; 43^ ,45, 45 
'4®^ 2’. «l; 52, 163; 53, 174- 54 

«. 220; 56, 238;37, 248; 55, 292; », 294 
Ethics, a natural science of, 338 ff. 
Evaluative behavior, 333 
Evolution, adaptive behavior and organic 

Experimental extinction, as a function of 

return poreoff^, , 

it3ff diff of' 
113 ff.; differential resistance to, 69 ff.; 


disinhibition of effects of, 302 ff., 304; 
and emotion, 134; and fractional 
antedating goal reactions, 135; gen- 
eralization of the effects of, 54; in maze 
learning, 300 ff., 304 ff.; in separate- 
discriminanda presentation discrimina- 
tion learning, 60 ff.; in insightful 
assembly of tool-using segments, 322 ff.; 
and incidental stimuli, 67; and increas- 
ing primary drive, 114; and inhibitory 
potential (Postulate IX), 9 ff.; and 
insightful behavior, 314 ff.; and nega- 
tive expectancy, 150; and partial rein- 
forcement, 120 ff., 134 ff.; and ratio of 
reinforcement, 134 ff.; and spontane- 
ous recovery, 17; and valuation, 330 

External inhibition, 20 

Fear, as an incipient flight reaction, 337 

Feeling, as Urban’s value concept, 341 ff. 

Field theory, and behavior in relation to 
objects in space, 250, 267, 269, 271 ff. 
Foreknowledge, 151 ff. 

Foresight, 151 ff. 

Fractional action phases, within the 
behavior link, 193 

Fractional antedating goal reactions, 
124 ff.; and adaptive behavior, 350; as 
a function of the stimulus trace, 127; 
as a secondary reinforcing agent, 14, 
125 ff.; and delay in reinforcement, 150; 
and double-drive learning, 136 ff.; and 
drive stimuli, 124 ff.; and generaliza- 
tion of habit strength, 128; and insight- 
ful behavior, 311, 321 ff.; and latent 
learning, 148 ff.; and path selection, 
259; and resistance to extinction, 135; 
and stimulus-intensity generalization, 
315; the role of stimulus-intensity 
dynamism in, 131 

Fractional goal stimulus, and the condi- 
tioning of inhibition, 133 ff. 

Frustration, of an anticipation, 133 ff.; 
and reversal learning, 115 ff. 

Galvanic skin reaction, 116 ff. 

orientation, 287 ff.; habit-family hier- 
archy in, 287 ff., 304; and incentive 
shifts, 140 ff.; and maze chaining, 172 
ff., 175 ff., 180 ff.; multidirectional, 275 
ff.; problem solving in, 309 ff.; pure- 
stimulus acts in, 288; reaction time in, 
286 ff.; spontaneous recovery in, 302 
ff.; and the goal-gradient principle, 
278 ff. 

Moral judgment, 337 ff. 

Motivation (see Drive), delay in reinforce- 
ment and the role of, 133; incentive, 7; 
and law of diminishing marginal 
utility, 333 ff.; primary (see Primary 
motivation); secondary, 6 


Need(s), and anxiety, 341; in energizing 
tendencies to action, 18; and pleasure- 
pain, 340 ff.; related to valuation, 329; 
the fractional goal stimulus in differ- 
entiating among, 125; and unlearned 
behavior, 5 

“ ““■'ifitation of the role 
. 02 ff.; and a theory of value, 340 ff. 


^^Uff **’^'^*' behavior in relation to, 
Oscillation (see Behavioral oscillation) 


rTh concept of, 340 ff. 

Path selection, 226 ff., 256 ff. 

217!rt,^^^’ ceceptor adjustme 
207 tespomes, in rote learn 

tr^'ttve ttimulus trace (rre Stum 
Pleasure nr w * 

drive, 342 ^ reduction 

vi'vn ^ 

'» 0 XI "X. 9 ff.; 

xiv 'j^'' 11 ff.; XIII. 

Prtn,;^=^V;XVI,xvn,I3 

'Oticc), i 

'■otulai V 6 m ■"> ' 

,^ha„Bss’in,£;^“"‘“'’P°'“ 
tetnforcement. Postulate III, 5 


Problem solving, ability and statistical 
methodology, 309 ff.; assembly of be- 
havior segments in, 308 ff.; delay in 
reinforcement in stick, 320; in maze 
learning, 276 ff.; involving three habit 
segments, 315 ff.; of non-speaking 
organisms, 308 ff.; and tool use, 320 
Proprioceptive stimuli, and behavior 
chains, 158 

Pure-stimulus act, antedating goal re- 
sponses as examples of the, 151; fore- 
sight and the, 151 ff.; in maze learning, 
288; in valuative behavior, 337, 343; 
language, evaluative behavior and the, 
332; the study of speech movement as 
a, 355 

Purpose, 151 ff. 

Purposive behavior, 152 

Reaction (see Response), adaptive, 215, 
347 ff.; antedating (see Antedating 
responses); defense, 110 ff., 118, 119; 
galvanic skin, 13 

Reaction amplitude, and reaction poten- 
tial, 13 

Reaction potential, absolute zero of, 12; 
adience-abience and generalized, 220; 
as a function of delay in reinforcement, 
126 ff.; as a function of incentive, 225; 
as a function of j.n.d. differences, 72 ff.; 
as a function of latency of reaction, 13; 
as a function of reaction amplitude, 13; 
asymptote of, 8; asynchronism of the 
oscillations of, 12, 235; and behavioral 
oscillation, 11 ff.; behavioral summa- 
tion of, 8, 161, 162; behavioral with- 
drawal of, 9; and changes in primary 
motivation, 225; and delay in reinforce- 
ment, 7 ff.; drive intensity in the 
generalization of, 11; generalization of, 

10 ff.; and generalization gradients 
(see Generalization gradient); incentive 
component of, 7; and incentive shifts, 

140 ff.; incidental, 65, 67; introspective 
reportability of, 344; momentary, 12 
ff.; net discriminatory, 72 ff., 75 ff., 84 
ff., 204 ff.; reaction threshold and 
momentary, 12 ff.; stimulus intensity as 
a basis of generalization of, 78 ff., 84 ff.; 
superthreshold, 9, 92; the constitution 
Inhibition, and amount of work, 202 ff.; 
dissipation of reactive, 36; external, 20; 
and learning within the individual be- 
havior link, 194 ff.; of delay, 113; the 
fractional goal stimulus and condition- 
ing of, 133 ff. 

Inhibitory aggregate, 9, 36 
Inhibitory potential (see Conditioned in- 
hibition), as a function of the number 
of responses, 10; behavioral summation 
of, 75 ff.; behavioral withdrawal of, 
70 ff., 77 ff.; and frustration, 133 ff.; 
generalization of, 11, 22 ff.; gradient 
of, 70; Postulate IX, 9 ff.; role of 
stimulus-intensity dynamism in the 
generalization of, 86; stimulus-intensity 
generalization of, 205; trial-and-error 
learning and the generalization of, 20; 
triple discriminanda problems and the 
generalization of, 75 ff. 

Innate reaction tendency, and organic 
evolution, 18, 19, 348 
Innate response hierarchy, 5, 347 ff. 
Insight, assembly of tool-using behavior 
segments and a theory of, 321 ff.; ex- 
perimental evidence of, 317; individual 
differences in the capacity for, 316; 
mediated by qualitative stimulus gen’ 
m ff-; problem of llo- 

aTia l ' ‘T enneraliea. 

spontaneous tool-use, 
317 ff.; theoretical diagrammatic repre- 
sentation of, 312 

Insightful learning, theory of, 310 ff. 
Intensity (see Stimulus intensity) 

Interaction (see Afferent 
stimulus interaction) 

Interest, and value theory, 343 
Introspection, 344 ff. 

j.n.d., scale of brightness, 63 

Latency of reaction, and generalization 
gradients, 68; in adient-abient be- 
havior, 230 ff., 265; and insightful 
behavior, 313; and reaction potential, 
13; and response strength, 17; and the 
galvanic skin reflex, 117 

Latent learning, current aspects of the 
problem of, 148 ff.; in theoretical per- 
spective, 140 ff. 

Law(s), introspection in the formulation 
of, 344; methodology of validating 
natural-science, 338 ff.; natural-science 
and behavior, 333 ff.; of diminishing 
marginal utility, 333 ff.; of diminishing 
returns, 335; of habit formation, 6; of 
mammalian behavior, 1; of social 
behavior, 2; of value, 329 ff.; primary 
molar behavioral, 13; symbolic con- 
structs and formulation of natural, 345 

Learning, adient-abient (see Adient- 
abient behavior); based upon corre- 
lated reinforcement intensities, 196 ff.; 
chain (see Behavior chains); compound 
trial-and-error (see Behavior chains); 
discrimination, 59 ff. (see also Dis- 
crimination learning); double drive, 
136 ff.; latent, 140 ff., 148 ff.; maze (see 
Maze learning); of novel acts, 209 ff., 
214; of the conditioned defense reac- 
tion, 111 ff.; rote, 117, 183; theoretical 
analysis of reversal, 114 ff., 120 ff.; trial- 
and-error (see Trial-and-error learn- 
ing); within the individual behavior 
link, 192 ff. 

Maze blind-alley elimination, 280 ff.; and 
the goal gradient, 303 

Maze blind-alley entrances, as a function 
of the distance to the goal, 293 ff.; 
depth of penetration in, 298 ff.; experi- 
mental extinction of, 300 ff.; and goal 
direction, 289 ff. 

Maze learning, and adience-abience, 270 
ff.; anticipatory turning and the per- 
severative stimulus trace in, 296 ff.; 
anticipatory turning and stimulus 
generalization in, 296 ff.; anticipatory 
turning and the goal gradient in, 298; 
anticipatory turning in, 296 ff.; blind- 
alley elimination in, 280 ff.; experi- 
mental extinction in, 302 ff., 305; 
generalization in, 178 ff.; and goal 
6; and fractional antedating goal 
reaction, 14, 125 ff.; generalization of, 
197; and goal-pointing paths, 304; and 
habituation, 173 ff.; in discrimination 
learning, 67; in goal orientation, 294 
ff.; in latent learning, 147 ff.; and 
insightful assembly of tool-using be- 
havior segments, 324; and insightful 
behavior, 314; and the antedating goal 
reaction, 150, 314; and the fractional 
goal stimulus, 128; theoretical truth 
value based upon, 335 ff.; and valua- 
tion, 330 ff. 

Serial reinforcement, defined, 156 ff.; and 
heterogeneous response chains, 170 ff.; 
in heterogeneous linear maze chaining, 
180 ff.; in homogeneous linear maze 
chaining, 179 ff.; in homogeneous 
response chains, 167 ff. 

Skilled behavior, 192 ff. 

Social behavior, flight reactions as 
examples of, 337; systematization of, 
335 ff.; valuation in, 336 ff. 
Spatial habit-family hierarchy, 267, 289, 
303, 310 ff. 

Spontaneous recovery, in delay learning, 
114; in maze learning, 301 ff.; recur- 
rence of extinguished responses result- 
ing from, 17; and response alternation, 57 


^muli, discrimination learning and the 
64 ff.; incidental, 
, ''_*^8htful assembly of tool-using 

322^^°^**^^"^ and antedating goal. 


milin, «,uivalnicc, 59 (j« clso Stimu- 
sracraliiaiion); fractional goal, 124 
^-Itcniing, 115 , 255, 270 IT.; rcccp- 
^ ’ recondary raotiv'ation based 

selection and simple 

Sracraliaation, and abient- 

kjvio, ‘".""'"on, 94, 96; and bc- 
h’it'cl’f.ti *n mediating 

^ ^Ifm-solving behavior, 320; 


and incidental stimuli, 64 ff.; and 
orientational behavior, 218; and path 
selection, 263; Postulate X, 10 ff.; and 
rote learning, 117; and the defense 
withdrawal reaction, 348; and valu- 
ation, 330 

Stimulus intensity, discrimination learn- 
ing, 69 ff., 84 ff., 87 ff.; discrimination 
learning based upon objective, 87 ff.; 
generalization, 11, 78 ff., 84 ff.; gen- 
eralization and the fractional ante- 
dating goal response, 315; generaliza- 
tion and habit-family hierarchy, 257 
ff.; generalization and response gen- 
eralization, 204 ff.; and space per- 
ception, 217; stimulus-intensity dy- 
namism as a function of the, 7 

Stimulus-intensity dynamism, and adient 
behavior, 222; as a constituent of reac- 
tion potential, 7 ff.; and conditions of 
learning versus evocation, 102, 128; 
fractional antedating goal reactions and 
the role of, 131; and generalization of 
inhibitory potential, 86; and general- 
ization of reaction potential, 78 ff.; in 
stimulus-intensity discrimination learn- 
ing, 84 ff.; Postulate VI, 7; and stimulus 
generalization, 11; and the molar 
stimulus trace, 101; and the stimulus 
trace as a generalization continuum, 
104 ff. 

Stimulus-response connections, diagram- 
matic representation of, 59, 92 

Stimulus trace, and afferent stimulus 
interaction, 109, 115; and antedating 
reactions, 108, 110 ff., 350; anticipatory 
turning in maze learning and the per- 
severative, 296, 303; as a generalization 
continuum, 104 ff., 350; behavior and 
the molar, 100 ff.; behavior chaining 
based upon the, 159 ff.; comparison 
phenomena dependent upon, 93; de- 
fined, 100; delay of reinforcement and 
the, 126 ff.; delay learning and the, 112 
ff.; derivation of the postulate on the, 
101 ff.; distributed versus massed learn- 
ing and the, 36; experimental validity 
of the theorems relating to the, 116 ff.; 
fractional antedating goal reactions as 
a function of the, 259; generalization 
of, 7 ff.; the interaction of two field 
gradients of adient, 227 ff., 269 ff.; the 
stimulus trace as a basis for generaliza- 
tion of, 104 ff.; and theory of value, 
340 ff.; value represented by, 329 ff. 
Reaction potentials, competition of, 12 ff., 
20 ff., 30 ff., 111, 331 ff. 

Reaction threshold, and absolute zero of 
reaction potential, 12 ff.; and delay in 
reinforcement, 8; drive and the, 7; and 
momentary reaction potential, 12 ff.; 
and response alternation, 57 
Reasoning, abstract, 355 ff. 

Receptor adjustment acts, 92 ff., 96, 216 
ff., 255, 335 


Reinforcement, all-or-none type of, 193 
ff., 195, 196, 212 ff.; antedating re- 
sponses and the time of, 110; cessation 
of pain as, 112; correlated, 200; cri- 
terion of, 152 ff.; delay learning and the 
role of, 112 ff.; delay of (see Delay of 
reinforcement); discrimination and the 
ratio of, 67; and drive stimuli, 124; ex- 
perimental extinction and partial, 120 
ff.; experimental extinction and the 
ratio of, 135 ff.; 
terminal, 158 ff.; gradient of delay in 
reinforcement and the amount of, 132 
ff.; gradient of serial, 167 ff., 170; and 

“otivation, mddcntal non-, 63: law 

POMoc,™, nation gonaralLaai.on gra! 
5 IT., accondary (,„ Sacondiy^: 

■>»"> in panial, 134 f 
and value, 329 ff. 

Reminbcence, 37 ff. 

Response(s) (see Reaction) 

33 co„t,ao„o„.„, 

tr., an.o,.o„al, 142; avooadon ami 


reaction potential, 7; hierarchy of, 17 
(see also Habit-family hierarchy); in- 
nate, 5, 19, 347 ff.; latency of reaction 
as an indication of strength of, 17; order 
of occurrence of, 17; relations type of, 
94, 96; repetition of erroneous, 17; rote 
learning and perseverative, 207; short- 
circuiting of, 111, 173; the stimulus 
trace and the antedating of, 108, 110 
ff., 350 

Response alternation, characteristics of 
trial-and-error learning, 184 ff.; com- 
parison of theoretical with empirical 
phenomena of, 49 ff.; and competition 
between reaction potentials, 41 ff., 55, 
56; and spontaneous recovery, 57 

Response alternation cycles, historical 
note on, 56 

Response chains (see Behavior chains) 

Response cycle, asymmetry of, 42 ff. 

Response generalization and behavioral 
oscillation, 12, 199; Corollary xiii, 12; 
definition and example of, 200; in tool- 
using behavior, 320 ff.; and insightful 
behavior, 323 ff.; locomotion as an 
example of, 217 ff.; and response in- 
tensity, 204 ff. 

Response intensity, as a function of in- 
centive, 141; generalization, 199 ff., 
204 ff., 206 ff.; learning within the 
behavior link, 193 ff. 

Response oscillation, and contraction 
intensity, 198 

Response selection, and trial-and-error 
learning, 60 

Reversal learning, theoretical analysis of, 
114 ff., 120 ff. 

Rote learning, 117, 183 

Satiation, and law of diminishing mar- 
ginal utility, 333 

Science, growth of empirical behavior, 
353 ff.; of molar behavior, 2 ff., 234; 
and moral judgment, 339; of ethics, 
338 ff.; theory of value and natural-, 
340 ff.; truth and natural-, 336 ff. 

Secondary motivation (see Drive) 

Secondary reinforcement (see Reinforce- 
ment), affecting the gradient of delay 
in reinforcement, 132 ff.; Corollary ii, 
of the fractional antedating goal reac- 
tion and the, 126 ff.; its role in adaptive 
behavior, 350 ff.; partial reinforcement 
and the, 120 ff.; reversal learning and 
the, 114 ff., 118 ff.; secondary reinforce- 
ment and the, 14; stimulus generaliza- 
tion based upon the, 159 ff.; stimulus- 
intensity dynamism and the, 7 ff., 101; 
stimulus reception and the, 5; tentative 
theorems regarding the, 107 ff. 

Symbolic logic, and behavior science 
development, 353 

Systematic behavior development, quali- 
tative and quantitative aspects of, 
353 ff. 


Terminal reinforcement, defined, 156; 
and double alternation, 184 ff.; and 
heterogeneous response chains, 165 ff.; 
in heterogeneous linear maze chaining, 
175 ff.; in homogeneous linear maze 
chaining, 172 ff.; and simple locomo- 
tion, 188 ff. 

Theory, future challenges to behavior, 
353 ff.; symbolic constructs and 
natural-science, 328; the test of a 
sound, 351 ff. 

Threshold, reaction (see Reaction thres- 
hold) 

Tool-use, spontaneous, 317 ff. 

Trace, stimulus (see Stimulus trace) 
Trial-and-error, vicarious, 93 
Trial-and-error learning, and adaptive 
behavior, 348 ff.; additional forms 
of, 56; an example of, 15 ff.; and behavior 
chains, 156; behavioral oscillation and 
locomotor, 255; by continuous trials, 
38 ff.; by massed trials, 36 ff.; definition 
of and example of, 20 ff.; differentiated 
from discrimination learning, 59 ff.; 
joint-stimulus presentation discrimina- 
tory, 92 ff.; quantitative assumptions 
in, 20 ff.; and response alternation, 
41-53; single stimulus presentation and 
discriminatory, 90 ff.; and space 
perception, 217; theoretical analysis of, 
16 ff. 

Valence, 267 

Validation, requirements for, 338 ff. 

Value, valuation, and behavior theory, 
327 ff.; distinguished from valuation, 
329; ethics and, 338 ff.; in choice 
behavior, 330 ff.; interpretation of some 
theories of, 340 ff.; needs and, 329 ff.; 
objective treatment of, 327 ff. 

Valuative behavior, consistency of, 332 
ff.; and pure-stimulus acts, 343 

Work, and amount of reward, 135; gra- 
dient, 206; habit-family hierarchy and 
the principle of less, 257 ff.; inhibitory 
potential as a function of, 10; path 
selection and the principle of less, 227; 
and secondary reinforcement, 112; and 
theory of value, 340 ff. 

Work differential, learning within the 
behavior link and the role of, 202 ff. 
Foreword 


To the completion of this book Professor Hull gave all the energy 
that he could muster during the last three years of his life. Owing to 
declining health he was permitted to work only a few hours a day. 
Visits to his office and laboratory and attendance at scientific 
seminars and discussions were drastically curtailed. Despite these 
handicaps he won the race. On May 10, 1952, he died, with the 
satisfaction of having finished a major portion of a program of 
research and writing that he had set for himself years earlier as his 
contribution to the efforts of the Institute of Human Relations to 
develop a basic science of behavior. 

The manuscript was turned over to the Yale University Press 
upon its completion early in February 1952. The galley proofs 
were ready shortly before he died. He did not see them. They were 
read for typographical errors and inconsistencies by his assistant 
and secretary, Ruth Hays. Throughout the reading of both the 
galley and page proofs Frank A. Logan gave invaluable assistance 
in matters requiring technical knowledge. Had Professor Hull lived 
to read the galley proofs, he would no doubt have made some last 
minute changes in technical details and would have detected any 
major alterations that needed to be made. He was keenly aware 
that in this rapidly growing field no volume can long remain com- 
pletely up to date. 

The subject index and glossary of symbols were prepared by his 
research assistant, John A. Antoinetti. 

In the preface Professor Hull gives generous acknowledgment to 
all who aided him in the preparation of the manuscript. We, in 
turn, now desire to honor him for his generosity in making freely 
available to all of us such a rich reservoir of original ideas. 

1952 Mark A. May, Director 

Institute of Human Relations 



Preface 


A decade or more ago I drew up a plan which proposed the writing 
of three volumes intended to cover in an elementary manner the 
range of ordinary mammalian behavior. This book is the second in 
that series. The first volume, Principles of Behavior (1943), was 
designed in the main to state the more important primary behavior 
principles considered necessary to mediate the deductions of a 
natural-science theory of behavior. A small supplementary volume, 
Essentials of Behavior (1951), presents these principles in a revised 
and nearly up-to-date form. The present work is intended primarily 
to show the application of the principles to the deduction of the 
simpler phenomena characterizing the behavior of single organisms. 
According to my original plan the third and final volume would 
apply these same principles to the deduction of the elementary 
phenomena of social behavior, i.e., of behavior manifested when the 
interacting objects are mammalian organisms of the same species. 
I greatly regret that in all probability I shall not be able to write 
the third volume. 

In the following pages I have made a serious attempt to give a 
quantitative, systematic account of some of the more important 
forms of non-social behavior. I make no pretense of having said 
the last word on any of them. I trust that the quantitative method- 
ology employed will readily make apparent to all serious students 
the errors which presumably have eluded our scrutiny and insight; 
hidden fallacies may seriously delay the advancement of a young 
science. 

I am glad to take this occasion to thank the many persons who in 
one way or another have contributed to this volume. John M. 
Felsinger, Arthur I. Gladstone, and Harry G. Yamaguchi per- 
formed a major task in experimentally quantifying reaction 
potential (sER). William J. Arnold, Arthur I. Gladstone, Allen J. 
Sprow, and Charles B. Woodbury made necessary empirical quan- 
titative determinations of various sorts of behavior chaining. Dr. 
Yamaguchi and John A. Antoinetti performed the computations 
upon which many of the theoretical graphs are based, and Mr. 
Antoinetti also prepared the subject index. Frank A. Logan read 
and gave an expert criticism of the entire manuscript during the 
final stages of its preparation. Frederick S. Cates made most of the 
line drawings for the various figures. In a different category are 
Professors Carl I. Hovland, Neal E. Miller, and Irvin L. Child; 
these men have given moral support and other aid whenever it was 
needed. Professor Kenneth W. Spence, through his unfailing 
interest in and understanding of the problems here discussed, and 
through criticisms, suggestions, and relevant experiments which he 
and his students have performed, has contributed to a degree 
feebly expressed by these few lines. In still another category are 

Tone eKperim’ Tu -"e. who have 

hav ub eZ ! here, or who 

- It'S 

Ruth Ha^. now rotZ -Z ^ «hh to thank 

secretary, for contributingZr gZu's inT/ T""' ^ 

Sion. I thank Professor Mark A M, '^^ective scientific expres- 
Human Relations, forZlTus^nT ^"^“tute of 

difficult and sometimes discouraging tasHTTrirP?”" 

grateful to Yale University anri^.hf 1 • ^ ^ ^ 
1. Introductory Considerations 


Science has two essential aspects — the empirical and the explana- 
tory. The empirical aspect is primarily concerned with the facts 
of the science as revealed by observation and experiment. The 
explanatory or theoretical aspect, on the other hand, consists in a 
serious attempt to understand the facts of the science, and to inte- 
grate them into a coherent, i.e., a logical, system. From these 
observations and integrations there are derived, directly or in- 
directly, the basic laws of the science. Since in a young science a 
certain amount of uncertainty naturally surrounds these basic 
laws, especially as to whether they are really basic, i.e., primary or 
underivable, their validity is temporarily assumed. It is for this 
reason that we have here called postulates what we assume to be 
laws. Once a set of presumptively basic laws has been isolated, the 
way is opened for the development of a natural-science theoretical 
system. That is primarily the task of this volume. It will present 
in a certain amount of detail an elementary theory of behavior. 

The specific task of the present chapter is to prepare the reader 
in a rather general way for the chapters which are to follow. Per- 
haps we can best show what our purpose is by contrasting this 
volume with Principles of Behavior, which was written earlier. That 
work was designed in the main to present the more important 
presumptive elementary laws of mammalian behavior, together 
with relevant explanatory considerations so that they would be 
provisionally understood. Because of the novelty of this approach 
to behavior theory at that time, we felt it desirable to give a num- 
ber of fairly elaborate examples of the deductive use of the prin- 
ciples in the logical derivation of secondary laws (or explanations) 

of more complex behavior phenomena. For this reason many read- 
ers of Principles of Behavior have mistakenly considered that work 
as presenting a completed system. Actually it contained merely 
some preliminary illustrative examples of what the ultimate system 
was intended to be. 


The chapters which follow Chapter 1 are designed to set forth 
in some detail a genuine portion of the developing system — that 
portion concerned with non-social behavior. Paralleling the pre- 
sentation of the theoretical conclusions there will be given, from 
time to time, a summary of the agreements and disagreements 
between the deductions from the postulates and the corresponding 
empirical facts. In this way the reader will be reminded not only 
of the necessity of continuously checking the results of theoretical 
implications, but of the current defects as well as the modest 
successes of the system as so far developed. 

Another incidental factor which may be noted in this connection 
live'.!™’’"’”® “P'rimentation has not yet 

mtemt,ic",h »>« fields aiready broaehed in 

Si for neil ^r''’ “ 'xp^ments continuously 

dlSv call r"" and theories eon- 

.heoremlThetJortrST“ 

Also, occasionally a principle orevin'^''™ “ 

to be in error; it is then °“dy considered true is found 

changed. But when you change''n,o,?“i',°^.''’'',’’“‘“'“"^ 

of the conclusions in the system as ** 1 " ^ system some 

written three or four years earf ® a section or so 

the different parts of a scientific wSfSe'tli*’' 

sistent with one another. If this is j “Peeted to be con- 
earlier and those containing the later d P°etions written 

tirely agree. But if publication is evS tfo'’ce°‘’T"'' 
when new developments must simnlv l, comes a time 

he trusted to later manuscript revis' ^ eecorded and consistency 
hope that all su™rsi« tiuch to 

in the present work. “ ^nod and rectified 
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In this connection it must be understood that behavior consists 
in an exceedingly complex mass of interrelations. On the other 
hand, exposition is essentially linear, and a linear presentation of 
behavior complexities inevitably distorts the reality. One of the 
most obvious distortions arising from this fact lies in over-simplifica- 
tion. To arrive at a genuine understanding of the multidimensional 
relationships of behavior is a personal achievement requiring a 
lifetime. Our effort to present some of the true richness of behavior 
reality in this exposition is seen scattered through the text in the 
frequent references to other related parts. The hurried reader may 
ignore these references, but the serious student with time for 
thought will find them helpful. 

Strictly speaking, the body of a scientific system consists of the 
mathematical derivations of the theorems which correspond to the 
empirical facts of the science. The deductions presented in this 
volume are all of a relatively simple concrete form, and are mostly 
quite informal. At one time a few of us worked out for a limited 
range of behavior a strict system to explore its possibilities (J). It 
is probably too early to do this on a large scale, though the rare 
persons qualified for such a task should before very long attempt 
to do it at least for the field covered by the present volume. 

Then there is the question of the numerical values of the differ- 
ent constants or parameters which appear in the mathematical 
derivations. Not one of these is really known at present except as 
the roughest approximation. The most conspicuous example of this 
is found in the field of individual and species differences, the values 
of which are believed to be based almost entirely on differences in 
constants. The same limitation holds in regard to the equations 
representing many of the functions stated in the postulates. Natu- 
rally the lack of this knowledge places great limitations on the 
range of theoretical predictions and on comparisons with empirical 
fact. Sometimes, otherwise significant potential theorems are not 
even mentioned because of the magnitude of these uncertainties. 
In other situations, however, where the probability seems to favor 
a given outcome a theoretical interpretation is attempted mainly 
as a means of calling attention to the problem and our general 
approach to the solution. 

But clearly, before he can follow intelligently the deductions of 
the behavior phenomena presently to be discussed, the reader must 



A BEHAVIOR SYSTEM 


know the substance of the postulates upon which the reasonings 
are based. Whether he has some familiarity with the system as a 
whole or whether he is coming to it for the first time, he probably 
will have occasion to return to these principles more than once 
for thoughtful scrutiny; their implications are no more obvious 
at first sight than are those of the axioms of Euclid. For this reason 


the postulates are assembled in the present chapter, and to
facilitate easy reference and identification they are listed in sequence, 
each postulate being given an upper-case Roman numeral. 

Also assembled in the present chapter are the formulations of 
certain major implications of the postulates, here called corollaries. 
These are placed in sequence with the postulates, each corollary 
following the postulate upon which it mainly depends; they are 
identified by lower-case Roman numerals. We shall find some of 
these corollaries to be of considerable use in the deductions of 
certain theorems which will appear in the body of this work. 

The reasoning underlying the formulations of most of the
postulates and corollaries has been published previously (1; 2)
[...] logical considerations by consulting either the reference
[...] be found in the following [...]. [...] these postulates have
been based on [...] organisms, particularly the rat, in the belief
that [...] from a postulate to a major corollary (iii) [...]
substance of a principle is expressible as [...] latter is now
given as [...] approximation. For purposes of convenient reference
to and identification of the more important equations presented
[...] the following chapters, a sequential number in parenthesis
accompanies each, on the right-hand [...].

A glossary of signs as used in this volume will be found on pages 
357 ff. For the most part these symbols are the same as those used 
in Principles of Behavior and Essentials of Behavior, though there are a 
few changes, additions, and withdrawals. 

Here follow the behavior postulates and major corollaries, set 
up in bold-faced type and italics, respectively, to distinguish them 
clearly from the body of the text. 

Postulate I. Unlearned Stimulus-Response Connections (sUr) (1, p. [illegible]; 2, p. [illegible])
Organisms at birth possess receptor-effector connections (sUr) which 
under combined stimulation (S) and drive (D) have the potentiality of 
evoking a hierarchy of responses that either individually or in
combination are more likely to terminate a need than would be a random 
selection from the reactions resulting from other stimulus and drive 
combinations. 

Postulate II. Stimulus Reception (S and s)* (2, pp. 7 ff.) 

A. When a brief stimulus (S) impinges upon a suitable receptor there 
is initiated the recruitment phase of a self-propagating molar afferent 
trace impulse (s'), the molar stimulus equivalent (S') of which rises as 
a power function of time (t) since the beginning of the stimulus, i.e., 

    $S' = 465{,}190 \times t^{7.6936} + 1.0$, (1) 

S' reaching its maximum (and termination) when t equals about .450". 

B. Following the maximum of the recruitment phase of the molar 
stimulus trace, there supervenes a more lengthy subsident phase (s'), 
the stimulus equivalent of which descends as a power function of time 
(t'), i.e., 

    $S' = 6.9310\,(t' + .01)^{-1.0796}$, (2) 

where t' = t - .450". 

C. The intensity of the molar stimulus trace (s') is a logarithmic function 
of the molar stimulus equivalent of the trace, i.e., 

    $s' = \log S'$. (3) 
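
The two phases of the stimulus trace described in Postulate II can be tabulated numerically. The following Python sketch is illustrative only. The function names are hypothetical, the logarithm is assumed to be to base 10, and the exponents 7.6936 and -1.0796 are assumed readings, since the exponents of equations (1) and (2) are hard to make out in this copy. With these values the rising and falling phases meet at roughly S' = 1,000 when t = .450 seconds.

```python
import math

T_PEAK = 0.450  # seconds; end of the recruitment phase per Postulate II A


def stimulus_equivalent(t):
    """Molar stimulus equivalent S' at t seconds after stimulus onset."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t <= T_PEAK:
        # Recruitment phase, equation (1); the exponent is an assumed reading.
        return 465190.0 * t ** 7.6936 + 1.0
    # Subsident phase, equation (2), with t' = t - .450; exponent assumed.
    t_prime = t - T_PEAK
    return 6.9310 * (t_prime + 0.01) ** -1.0796


def trace_intensity(t):
    """Trace intensity s' = log S', equation (3); base 10 is assumed."""
    return math.log10(stimulus_equivalent(t))


if __name__ == "__main__":
    for t in (0.0, 0.2, 0.45, 0.5, 1.0, 5.0):
        print(f"t={t:4.2f}  S'={stimulus_equivalent(t):9.2f}  "
              f"s'={trace_intensity(t):5.2f}")
```

Printing the table shows the sharp rise during recruitment and the slower power-function decline afterward, which is the qualitative shape the postulate describes.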

Postulate III. Primary Reinforcement (2, pp. 15 ff.) 

Whenever an effector activity (R) is closely associated with a stimulus 
afferent impulse or trace (s) and the conjunction is closely associated 
with the rapid diminution in the motivational stimulus (Sd or sG), there 
will result an increment (Δ) to a tendency for that stimulus to evoke 
that response. 

* For the derivation of Postulate II see Chapter 4, pp. 101 ff. 

Corollary i. Secondary Motivation (2, pp. 21 ff.) 

When neutral stimuli are repeatedly and consistently associated with 
the evocation of a primary or secondary drive and this drive stimulus 
undergoes an abrupt diminution, the hitherto neutral stimuli acquire 
the capacity to bring about the drive stimuli (Sd), which thereby become 
the condition (Cd) of a secondary drive or motivation. 

Corollary ii. Secondary Reinforcement (2, pp. 26 ff.) 

A neutral receptor impulse which occurs repeatedly and consistently 
in close conjunction with a reinforcing state of affairs, whether primary 
or secondary, will itself acquire the power of acting as a reinforcing 
agent. 

Postulate IV. The Law of Habit Formation (sHr) (1, pp. 102 ff.; 2, pp. 29 ff.) 

If reinforcements follow each other at evenly distributed intervals, 
everything else constant, the resulting habit will increase in strength 
as a positive growth function of the number of trials according to the 
equation, 

    ${}_{S}H_{R} = 1 - 10^{-.0305N}$, (4) 

where N is the total number of reinforcements from Z. 

Postulate V. Primary Motivation or Drive (D) (1, pp. [illegible] ff.; 2, pp. 33 ff.) 

A. [...] (1) the drive proper (D'), which is an increasing monotonic 
[...] function of the number of hours of food privation (h), and 
(2) a negative or inanition component (ε), which is a positively 
accelerated [...] function of h decreasing from 1.0 to zero, i.e., 

    $D = D' \times \epsilon$, (5) 

where 

    $D' = 37.824 \times 10^{-27.496\,(1/h)} + 4.001$, 

and 

    $\epsilon = 1 - .00001045\,h^{2.486}$. 
B. The functional relationship of drive (D) to one drive condition (food 
privation) is: during the time from h = 0 to about h = 3, drive rises 
in a linear manner until the function abruptly shifts to a near
horizontal, then to a concave-upward course, gradually changing to a 
convex-upward course reaching a maximum at about h = 59, 
after which it gradually falls to the reaction threshold (sLr) at around 
h = 100. 

C. Each drive condition (Cd) generates a characteristic drive stimulus 
(Sd) which is a monotonic increasing function of this state. 

D. At least some drive conditions tend partially to motivate into action 
habits which have been set up on the basis of different drive conditions. 
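
As a rough numerical check on Postulate V, the product D = D' × ε can be evaluated hour by hour. The Python sketch below is illustrative only: the function names are hypothetical, and the constants (in particular the exponents -27.496 × 1/h and 2.486) are assumed readings of equation (5) and its two auxiliary equations, which are poorly legible in this copy. The sketch simply evaluates the equations; it does not model the initial linear rise from h = 0 to h = 3 described in part B.

```python
def drive_proper(h):
    """Drive proper D' for h hours of food privation (assumed constants)."""
    if h < 0:
        raise ValueError("h must be non-negative")
    if h == 0:
        return 4.001  # 10 ** (-27.496 / h) tends to zero as h approaches 0
    return 37.824 * 10 ** (-27.496 / h) + 4.001


def inanition(h):
    """Inanition component, falling from 1.0 toward zero as h grows."""
    if h < 0:
        raise ValueError("h must be non-negative")
    return 1.0 - 0.00001045 * h ** 2.486


def drive(h):
    """Primary drive D = D' x inanition component, equation (5)."""
    return drive_proper(h) * inanition(h)


if __name__ == "__main__":
    peak_h = max(range(0, 101), key=drive)
    print("peak at h =", peak_h, "with D =", round(drive(peak_h), 2))
    print("D at h = 100:", round(drive(100), 2))
```

Under these assumed constants the computed curve peaks in the neighborhood of h = 59 and has fallen to a small value by h = 100, which agrees with the verbal description in part B.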

Postulate VI. Stimulus-intensity Dynamism (V) (2, pp. 41 ff.) 

Other things constant, the magnitude of the stimulus-intensity component 
(V) of reaction potential (sEr) is a monotonic increasing logarithmic 
function of S, i.e., 

    $V = 1 - 10^{-.44 \log S}$. (6) 

Postulate VII. Incentive Motivation (K) (1, pp. 124 ff.; 2, pp. 47 ff.) 

The incentive component (K) of reaction potential (sEr) is a negatively 
accelerated increasing monotonic function of the weight (w) of food 
or quantity of other incentive (K') given as reinforcement, i.e., 

    $K = 1 - 10^{-a\sqrt{w}}$. (7) 

Postulate VIII. The Constitution of Reaction Potential (sEr) (1, pp. 178 ff.; 
2, pp. 57 ff.) 

The reaction potential (sEr) of a bit of learned behavior at any given 
stage of learning, where conditions are constant throughout learning 
and response-evocation, is determined (1) by the drive (D) operating 
during the learning process multiplied (2) by the dynamism of the 
signaling stimulus trace (V1), (3) by the incentive reinforcement (K), 
and (4) by the habit strength (sHr), i.e., 

    ${}_{S}E_{R} = D \times V_1 \times K \times {}_{S}H_{R}$. (8) 
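
The multiplicative constitution of reaction potential in Postulate VIII can be made concrete with a short calculation. The Python sketch below is illustrative only. The function names are hypothetical; the rate constants .0305 (habit growth) and .44 (stimulus-intensity dynamism) are assumed readings of equations (4) and (6); the coefficient `a` in the incentive function and its default value of 0.5 are placeholders chosen for the example; and the logarithm is assumed to be to base 10.

```python
import math


def habit_strength(n, rate=0.0305):
    """Habit strength after n reinforcements: 1 - 10^(-rate * n)."""
    return 1.0 - 10 ** (-rate * n)


def intensity_dynamism(stimulus_intensity, rate=0.44):
    """V = 1 - 10^(-rate * log S); S = 1 gives V = 0 in this sketch."""
    return 1.0 - 10 ** (-rate * math.log10(stimulus_intensity))


def incentive(weight, a=0.5):
    """K = 1 - 10^(-a * sqrt(w)); the coefficient a is a placeholder."""
    return 1.0 - 10 ** (-a * math.sqrt(weight))


def reaction_potential(drive, dynamism, incentive_value, habit):
    """Equation (8): the product of the four components."""
    return drive * dynamism * incentive_value * habit


if __name__ == "__main__":
    for n in (0, 5, 10, 20, 40):
        e = reaction_potential(
            drive=10.0,
            dynamism=intensity_dynamism(100.0),
            incentive_value=incentive(4.0),
            habit=habit_strength(n),
        )
        print(f"N={n:2d}  reaction potential = {e:.3f}")
```

Because the components multiply, any one of them standing at zero (no drive, no incentive, or no habit) yields zero reaction potential, however large the others may be.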

Corollary iii. Delay in Reinforcement (J) (1, pp. 135 ff.; 2, pp. 52 ff.)* 
A. The greater the delay in reinforcement of a link within a given 
behavior chain, learning and response-evocation conditions remaining 
* The derivation of this corollary is presented in Chapter 5, pp. 126 ff. 



Corollary vii. The Withdrawal (∸) of Reaction Potential (2, pp. 68 ff.) 
If a smaller reaction potential (sE'r) is to be withdrawn (∸) from a 
larger reaction potential (C), the result will be: 

    $C \mathbin{\dot{-}} {}_{S}E'_{R} = {}_{S}E_{R} = \dfrac{M(C - {}_{S}E'_{R})}{M - {}_{S}E'_{R}}$. (13) 


Corollary viii. The Problem of the Behavioral Summation (+) of Incentive 
Substances (K) (2, pp. 70 ff.) 

If two incentive substances, t and a, have A√w and B√m as the 
exponential components of their respective functional equations, the 
second substance will combine (+) with the first in the production 
of the total K according to the following equation: 

    $K_{[\ldots]} = 1 - [\ldots]$. (14) 


Postulate IX. Inhibitory Potential (1, pp. 258 ff.; 2, pp. 73 ff.) 

A. Whenever a reaction (R) is evoked from an organism there is left 
an increment of primary negative drive (Ir) which inhibits to a degree 
according to its magnitude the reaction potential (sEr) to that response. 

B. With the passage of time since its formation, Ir spontaneously 
dissipates approximately as a simple decay function of the time (t) 
elapsed; i.e., 

    $I'_{R} = I_{R} \times 10^{-.018t}$. (15) 

C. If responses (R) occur in close succession without further
reinforcement, the successive increments of inhibition (ΔIr) to these responses 
summate to attain appreciable amounts of Ir. These also summate with 
sIr to make up an inhibitory aggregate (İr), i.e., 

    $\dot{I}_{R} = I_{R} + {}_{S}I_{R}$. (16) 

D. When experimental extinction occurs by massed practice, the 
İr present at once after the successive reaction evocations is a positive 
growth function of the order of those responses (n), i.e., 

    $\dot{I}_{R} = 1.84\,(1 - 10^{-[\ldots]n})$. (17) 

E. For constant values of superthreshold reaction potential (sEr) set 
up by massed practice, the number of unreinforced responses (n) 
producible by massed extinction procedure is a linear decreasing 
constant, the weaker will be the resulting reaction potential of the link 
in question to the stimulus traces present at the time. 

B. The greater the delay in the receipt of the incentive by groups of 
learning subjects, learning and response-evocation conditions remaining 
constant, the weaker will be the resulting learned reaction potentials 
(sEr_d), the shape of the gradient as a function of the respective delays 
being roughly that of decay with the lower limit of the extended gradient 
passing beneath the reaction threshold, i.e., 

    $J = {}_{S}E_{R_d} = D \times V_2 \times K \times {}_{S}H_{R} \times 10^{-[\ldots]d} \times V_1$, (9) 

where 

    $d = \log S' \text{ of } V_1 - \log S' \text{ of } V_2$. 


Corollary iv. The Summation (+) of Habit Strengths (2, pp. 60 ff.) 

If two stimuli, S' and S, are reinforced separately to a response (R) 
by N' and N reinforcements respectively, and the ${}_{S'}H_{R}$ generalizes to S 
in the amount of ${}_{S}\bar{H}_{R}$, the summation (+) of the two habit strengths 
at S will be the same as would result from the equivalent number of 
reinforcements at S, i.e., 

    ${}_{S}H_{R} \mathbin{\dot{+}} {}_{S}\bar{H}_{R} = {}_{S}H_{R} + {}_{S}\bar{H}_{R} - {}_{S}H_{R} \times {}_{S}\bar{H}_{R}$. (10) 

Corollary v. The Summation (+) of Reaction Potentials (2, pp. 64 ff.) 

If two stimuli, S' and S, are reinforced separately to a response (R) 
[...] the equivalent number of reinforcements in an original
learning, i.e., 

    ${}_{S}E_{R} \mathbin{\dot{+}} {}_{S}\bar{E}_{R} = {}_{S}E_{R} + {}_{S}\bar{E}_{R} - \dfrac{{}_{S}E_{R} \times {}_{S}\bar{E}_{R}}{M}$, (11) 

where M is the asymptote of sEr by distributed trials. 

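Corollary vi. The Withdrawal (∸) of Habit Strength (2, pp. 66 ff.) 
If a smaller habit strength (sH'r) is to be withdrawn (∸) from a larger 
habit strength (C), the result will be: 

    $C \mathbin{\dot{-}} {}_{S}H'_{R} = {}_{S}H_{R} = \dfrac{C - {}_{S}H'_{R}}{1 - {}_{S}H'_{R}}$. (12) 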
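
The summation and withdrawal corollaries are inverse operations, which a few lines of code make easy to check. The Python sketch below is illustrative only; the function names are hypothetical. It assumes the reading of the equations in which two strengths a and b combine as a + b - ab/M, where M is the asymptote (1 for habit strength, the asymptote of reaction potential otherwise), and in which withdrawal solves that same expression for a.

```python
def summate(a, b, asymptote=1.0):
    """Behavioral summation: a + b - a*b/M (M = 1 for habit strength)."""
    return a + b - a * b / asymptote


def withdraw(total, b, asymptote=1.0):
    """Behavioral withdrawal: M*(total - b)/(M - b), the inverse of summate."""
    if b >= asymptote:
        raise ValueError("the withdrawn strength must be below the asymptote")
    return asymptote * (total - b) / (asymptote - b)


if __name__ == "__main__":
    # Habit strengths (asymptote 1): .5 and .4 combine to .7, not .9.
    combined = summate(0.5, 0.4)
    print(combined, withdraw(combined, 0.4))

    # Reaction potentials with an illustrative asymptote M = 6.
    combined = summate(2.0, 3.0, asymptote=6.0)
    print(combined, withdraw(combined, 3.0, asymptote=6.0))
```

The combined value always falls short of the simple arithmetic sum and can never exceed the asymptote, which is the point of treating summation as equivalent to additional reinforcements on a single growth curve.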
B. A stimulus intensity (S1) generalizes to a second stimulus intensity 
(S2) according to the equation, 

    [...] $= {}_{S_1}H_{R} \times 10^{-[\ldots]d} \times V_1$, (21) 

where d represents the difference between S1 and S2 in log units and 

    [...] $= D \times K \times V_2 \times$ [...], (22) 

and where (D × K) is constant and V2 is the stimulus-intensity dynamism 
at S2. 

C. In the case of qualitative stimulus differences, ordinary conditioning 
and extinction spontaneously generate a gradient of inhibitory
potential (sIr) which is a negative growth function of sIr and d, i.e., 

    ${}_{S_2}I_{R} = {}_{S_1}I_{R} \times 10^{-[\ldots]d}$, (23) 

and in the case of stimulus-intensity differences, 

    ${}_{S_2}I_{R} = {}_{S_1}I_{R} \times 10^{-[\ldots]d} \times V_1$. (24) 

Corollary xii. The Generalization of sHr and sEr on Sd as a Continuum 
(1, pp. 235 ff.; 2, pp. 89 ff.) 

When a habit is set up in association with a given drive intensity (Sd) 
and its strength is tested under a different drive intensity, there will 
result a falling gradient of sHr and sEr. 


Postulate XI. Afferent Stimulus Interaction (1, pp. 216 ff.; 2, pp. 93 ff.) 

All afferent impulses (s's) active at any given instant mutually interact, 
converting each other into š's which differ qualitatively from the 
original s's, so that a reaction potential (sEr) set up on the basis of one 
afferent impulse (s) will show a generalization fall to šEr when the 
reaction (R) is evoked by the other afferent impulse (š), the amount of 
the change in the afferent impulses being shown by the number of 
j.n.d.'s separating the sEr's involved according to the principle, 

    $d = [\ldots]$. (25) 

Postulate XII. Behavioral Oscillation (sOr) (1, pp. 304 ff.; 2, pp. 96 ff.) 

A. A reaction potential (sEr) oscillates from moment to moment, the 
distribution of behavioral oscillation (sOr) deviating slightly from the 
function of the magnitude of the work (W) involved in operating the 
manipulanda, i.e., 

    $n = 3.25\,(1.1476 - .00984W)$. (18) 


Corollary ix. Conditioned Inhibition (2, pp. 74 ff.) 

Stimuli and stimulus traces closely associated with the cessation of a 
given activity, and in the presence of appreciable Ir from that response, 
become conditioned to this particular non-activity, yielding conditioned 
inhibition (sIr) which will oppose sEr's involving that response, the 
amount of ΔsIr generated being an increasing function of the Ir 
present. 


Corollary x. Inhibitory Potential (İr) as a Function of Work (1, pp. 279 
ff.; 2, pp. 81 ff.) 

For a constant value of n, the inhibitory potential (İr) generated by 
the total massed extinction of reaction potential set up by massed 
[...] accelerated increasing function of the 
work (W) involved in operating the manipulandum, which gradually 
[...]. 

Corollary xi. [...] 

[...] extinction of reaction potential [...] 
accelerated increasing function of the number of reactions (n) 
required. 

Postulate X. Stimulus Generalization [...] (1, p. [illegible]; 2, pp. [illegible]) 

A. [...] strength (sHr) generalizes [...] gradient on the 
qualitative continuum from the simple learned attachment of S1 to R: 

    [...] (19) 

where d represents the difference between S1 and S2 in j.n.d.'s, and 

    [...] $= D \times K \times V_2 \times$ [...] (20) 

and where D × K × V2 is constant. 
magnitude greater than sLr, only that reaction whose momentary
reaction potential (sEr) is greatest will be evoked. 

Postulate XIV. Reaction Potential (sEr) as a Function of Reaction Latency 
(stR) (1, pp. 336 ff.; 2, p. 105) 

Reaction potential (sEr) is a negatively accelerated decreasing
function of the median reaction latency (stR), i.e., 

    ${}_{S}E_{R} = 2.845\,({}_{S}t_{R})^{-.483}$. (28) 

Postulate XV. Reaction Potential (sEr) as a Function of Reaction Amplitude 
(A) (1, pp. 339 ff.; 2, p. 108) 

Reaction potential (sEr) is an increasing linear function of the
Tarchanoff galvanic skin reaction amplitude (A), i.e., 

    ${}_{S}E_{R} = .02492A$. (29) 

Postulate XVI. Complete Experimental Extinction (n) as a Function of
Reaction Potential (sEr) (1, pp. 227 ff.; 2, p. 110) 

A. The reaction potentials (sEr) acquired by massed reinforcements 
are a negatively accelerated monotonic increasing function of the 
median number of massed unreinforced reaction evocations (n)
required to produce their experimental extinction, the work (W) involved 
in each operation of the manipulandum remaining constant, i.e., 

    ${}_{S}E_{R} = 4.0\,(1 - 10^{-[\ldots]n}) + .46$. (30) 

B. The reaction potentials (sEr) acquired by quasi-distributed
reinforcements are a positively accelerated monotonic increasing function 
of the median number of massed unreinforced reaction evocations (n) 
required to produce their experimental extinction, the work (W) 
involved in each operation of the manipulandum remaining constant, 
i.e., 

    ${}_{S}E_{R} = .1225 \times 10^{[\ldots]n} + 2.114$. (31) 

Postulate XVII. Individual Differences (2, p. 115) 

The "constant" numerical values appearing in equations representing 
primary molar behavioral laws vary from species to species, from 
individual to individual, and from some physiological states to others 
in the same individual at different times, all quite apart from the factor 
of behavioral oscillation (sOr). 
Gaussian probability form in being leptokurtic with β2 at about 4.0; i.e., 
the distribution is represented by the equation (4, p. lxiii), 

    $y = [\ldots]$. 

B. The oscillation of sEr begins with the dispersion of approximately 
zero at the absolute zero (Z) of sHr, this at first rising as a positive 
growth function of the number of subthreshold reinforcements to 
an unsteady maximum, after which it remains relatively constant 
though with increasing variability. 

C. The oscillations of competing reaction potentials at any given 
instant are asynchronous. 

Corollary xiii. Response Generalization (1, pp. 316, 319)* 

A. The contraction of each muscle involved in a habitual act varies 
from instant to instant (sOr) about a central reinforced region of 
intensity which is approximately normal (leptokurtic) in distribution; 
this constitutes response-intensity generalization. 

B. Where several muscles jointly contract to produce a given habitual 
act the contraction of each muscle varies more or less (sOr)
independently of the others, producing a qualitative deviation from the central 
[...] contractions originally 
reinforced; this constitutes qualitative response generalization. 

Postulate XIII. [...] and the Reaction 
Threshold (sLr) (1, pp. 322 ff.; 2, p. 101) 

A. [...] above [...] 
zero (Z) of reaction potential (sEr), i.e., 

    ${}_{S}L_{R} = Z + B$. (26) 

B. [...] momentary reaction [...] 
exceeds the reaction threshold, i.e., unless, 

    ${}_{S}\dot{E}_{R} > {}_{S}L_{R}$. (27) 

Corollary xiv. [...] Reaction Potentials (sEr) 

[...] more incompatible 
reactions (R) occur in [...] same instant, each in a 

* The derivation of this [...] in Chap. [illegible]. 

2. Simple Trial-and-Error Learning 


We shall begin our elementary account of systematic behavior 
theory with the consideration of trial-and-error learning, one of the 
less complex, more common, and better known of the behavior 
processes. 

A Concrete Example of Simple Trial-and-Error Learning 
Consider the following. A hungry but very tame albino rat, about 
three months of age, is placed in a small rectangular cage; the 
cage is made of rather coarse wire screen so that the animal’s 
behavior may be clearly observed. A small brass rod with a short 
crosspiece at its end projects through one of the meshes of the 
screen a half inch or so into the cage. A short distance outside the 
cage this rod is pivoted on an easily moved bearing in such a way 
that the crosspiece within the cage can be moved freely up and 
down. A weak spring outside the cage holds the portion inside, 
upward against a restraining shoulder. However, a slight pressure 
on the crosspiece will depress the end of the rod a few millimeters, 
thereby closing an electric circuit which activates an electromagnet; 
this in turn releases from a magazine into a food-cup placed on the 
floor of the cage a small cylinder of food much relished by the rat. 

When first placed in the cage the rat remains quiet for a time, 
looking about from a somewhat crouching posture suggesting fear. 
Gradually he relaxes from the fear posture and sniffing about be- 
gins to examine his surroundings, pausing frequently to wash his 
face or just sit for a time, and then resuming his exploration. At 
length he approaches the food-cup, which, having the odor of 
food about it from previous use, focuses his attention on its vicinity. 
Corollary xv. Secondary Reinforcement by Fractional Antedating Goal
Reaction (rG → sG) 

When a stimulus (S) or a stimulus trace (s) acts at the same time 
that a hitherto unrelated response (R) occurs and this coincidence is 
accompanied by an antedating goal reaction (rG), the secondary
reinforcing powers of the stimulus evoked by the latter (sG) will reinforce 
S to R, giving rise to a new S → R dynamic connection. 
1. Why are the false or inappropriate reactions (R-), such as 
sniffing in a given corner, gradually abandoned? The answer is 
that abandonment occurs to a considerable extent because of 
experimental extinction (IX A; ix).¹ It is also due in part to the 
strengthening of R+, which competes with R-. 

2. Why are erroneous reactions (R-'s) often repeated many 
times before being abandoned? Experimental extinction is a
cumulative process and numerous repetitions are required to generate 
enough internal inhibition (Ir- and sIr-) to diminish the initially 
dominant reaction tendency or potential (sEr-) to a strength less 
than that of the next strongest reaction potential (IX C). 

3. What determines the order in which the several responses are 
tried by the organism? This is partly determined by the stimuli 
which chance to impinge upon the organism at any given time, 
but mainly by the relative strength of the several competing
reaction potentials in the hierarchy of sEr's possessed at that time by the 
organism. 

4. Why does one organism follow a different sequence of reaction 
from that followed by another organism, the stimulation given 
being parallel? This is mainly because the previous history of an 
organism differs considerably from that of other organisms in the 
relevant hierarchy of the reaction potentials laid down in each. 

5. When the correct reaction (R+) finally occurs, what
strengthens this response tendency? The answer to this is in accordance 
with the principles of reinforcement (III; IV). 

6. Why does the organism often return to R-'s after one or 
more successes (R+'s)? This is in part because of behavioral
oscillation (sOr) (XII A, C) and in part because of recovery from
preceding experimental extinction (IX B). 

7. How can we be sure both that the R+'s increase in strength 
following reinforcement and that the R-'s decrease in strength 
through failure of reinforcement? This is revealed by changes in 
the respective response latencies. The implication is that the R+ 
responses decrease in reaction time and the R- responses increase 
(XIV). 


¹ Upper-case Roman numerals in parenthesis (e.g., IX A) here and elsewhere 
throughout the text indicate postulates relevant to the subject being discussed at that 
point. Similar insertions in lower-case Roman numerals (e.g., ix) indicate relevant 
corollaries. 
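
The interplay of the seven answers above can be illustrated with a toy simulation. The Python sketch below is an illustration under stated assumptions, not the quantitative theory developed in the text. All names and numerical values are hypothetical; momentary oscillation is approximated by a clipped normal variate rather than the leptokurtic distribution of Postulate XII; extinction is modeled as a fixed decrement per unreinforced evocation; and spontaneous recovery (IX B) is omitted.

```python
import random


def simulate_trial_and_error(trials=200, seed=0):
    """Toy competition between a correct (R+) and an incorrect (R-) reaction."""
    rng = random.Random(seed)
    e_plus, e_minus = 1.5, 3.0      # initial potentials; R- dominant (assumed)
    asymptote, growth = 5.0, 0.1    # reinforcement parameters (assumed)
    inhibition_step = 0.25          # loss per unreinforced evocation (assumed)
    threshold = 0.3                 # reaction threshold (assumed)
    history = []

    def oscillation():
        # Bounded, roughly bell-shaped momentary variation (a simplification).
        return max(-1.0, min(1.0, rng.gauss(0.0, 0.5)))

    for _ in range(trials):
        momentary = {"R+": e_plus + oscillation(),
                     "R-": e_minus + oscillation()}
        winner = max(momentary, key=momentary.get)
        if momentary[winner] <= threshold:
            history.append(None)    # nothing exceeds the threshold this trial
            continue
        history.append(winner)
        if winner == "R+":
            # Reinforcement: growth toward the asymptote (cf. III; IV).
            e_plus += growth * (asymptote - e_plus)
        else:
            # Non-reinforcement: accumulated inhibition (cf. IX A; ix).
            e_minus = max(0.0, e_minus - inhibition_step)
    return history, e_plus, e_minus


if __name__ == "__main__":
    history, e_plus, e_minus = simulate_trial_and_error()
    print("first 20 trials:", history[:20])
    print("last 20 trials: ", history[-20:])
    print("final potentials:", round(e_plus, 2), round(e_minus, 2))
```

In a typical run the incorrect reaction dominates the early trials, is repeated several times before its potential falls near that of R+, and then reappears occasionally because of oscillation, much as questions 2 and 6 describe.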
He rears on his hind legs to sniff the cage wall above the food-cup, 
and his paws chance to press lightly on the crossbar of the rod. 
This at once closes the electric circuit, a food pellet drops into the 
cup, and presently the rat finds and eats the pellet. The random 
exploratory activity then is resumed much as before except that 
it is confined more closely to the region of the food-cup. This of 
course increases the probability of the chance pressure of the bar, 
and after some minutes, much sniffing, and considerable
face-washing the rat touches the bar again and eats the resulting food. 

The learning process continues for fifteen minutes or so. As 
practice goes on, the amount of random sniffing and exploration 
grows less and less, and the time required to secure each succeeding 
pellet diminishes, on the whole, until practically all irrelevant 
activity has disappeared and the rat spends the time between 
operations of the bar exclusively in eating the pellet he has secured 
by his immediately preceding pressure. After two or three such 
practice sessions the animal has, by a process of trial-and-error, 
fully learned to secure food from the apparatus. 

With the exception of the originally very feeble tendency to 
press the bar, the various movements which the experimental cage 
situation in conjunction with the animal’s need of food originally 
evoked have all disappeared. This process of the differential 
strengthening of the one reaction tendency in relation to the 
competing reaction tendencies is known as trial-and-error selection. 
The responses which resulted in a reduction of the need will be 
known as appropriate, correct, or right responses (R+), whereas those 
which did not do this will be known as inappropriate, incorrect, or 
wrong responses (R-). 

An Elementary Theoretical Analysis 

With [...] simple trial-and-error learning before us we 
[...] proceed [...] statement of the more obvious theoretical questions 
demanding [...] explanation [...] a preliminary examination of the 
[...] postulates utilized in the deductions 
[...] and a notion [...] through [...]. The tracing [...] 
preliminary [...] (6) will give a certain amount of 
[...] perspective useful for the comprehension [...] 
which [...]. 
Moreover, the muscles whose contractions will be necessary to 
bring about a state of affairs which will reduce the organism’s 
need will all have been used in adaptations to other situations, so 
their use in an unfamiliar problem situation will not be new either. 
The novelty here, again, will be in the combination of the muscles 
and the combinations of the reaction intensities required of each 
to produce the reinforcing state of affairs. It is evident that when 
encountering any novel situation an organism at a fairly early 
stage of life has the potentiality of an almost limitless variety of 
reaction tendencies, consisting of inherited tendencies overlaid and 
modified by those acquired through 
all the learning of its preceding life. 

On the basis of the above prelim- 
inary analysis we may represent 
more precisely the origin and pre- 
cipitating conditions of simple trial- 
and-error learning for the present 
expository purposes as follows. A 
correct reaction, R+, is connected 
to a stimulus aggregate, S+, in con- 
junction with other accidentally 
accompanying stimuli or stimulus 
aggregates. Similarly, an incorrect 
reaction, R_, is connected to a 
largely different stimulus aggre- 
gate, S_. Finally, S+ and S_ are presented simultaneously, together 
with one or more chance stimuli or stimulus aggregates novel to 
each. The resulting formula is shown diagrammatically in Figure 
1. An experimental situation representing an opportunity for the 
subject to make the responses separately may be seen in Figure 13 
(p. 50). Reinforcement always follows R+, but never follows R-. 

A not unusual complication of such situations is that either S+ 
or S-, or both, will not be exactly the same as the stimuli directly 
conditioned to R+ and R- respectively, but will fall on a
generalization continuum more or less remote from the stimuli S+ and S- 
which were originally connected to the respective reactions. Under 
these conditions, S+ and S- will tend to evoke their respective 
reaction potentials (sEr) if not too remote on the generalization 


FIGURE 1. Diagrammatic representation of the divergent reaction
potentials arising from the conjunction of two stimuli connected to
incompatible responses. R+ represents pushing a small brass bar to the
left, and R- represents pressing downward a bar similar in appearance
but placed in a horizontal, rather than vertical, position. (See
Figure 13, p. 50.) 
There are many more questions which we shall need to ask and 
attempt to answer about simple trial-and-error learning, and some 
of the briefly stated answers to those above will need to be
elaborated in considerable quantitative detail. Nevertheless, if the reader 
has really understood the seven explanations just given he will be 
well on his way to a comprehension of the more detailed
quantitative explanations now to begin. 


Conditions Antecedent to Simple Trial-and-Error Learning

From the preceding considerations a number of the essential
conditions of simple trial-and-error learning are evident. One of these
is that the situation in conjunction with the need or needs of the
organism at the time (77, pp. 226 ff.) will produce a variety of
more or less persistent tendencies to action. The origins of these
tendencies are various. The process of organic evolution, through
inheritance, provides the organism with a considerable variety of
innate reaction tendencies (sUr) at the very outset of life; this
furnishes an adequate basis for genuine trial-and-error learning.

The process of antecedent trial-and-error learnings in a great but
miscellaneous series of situations selects, joins, and molds the
inherited tendencies to action so that one stimulus combination will
evoke one movement or muscular-contraction combination, another
partially different stimulus combination will evoke a different
[...], and so on. Since the number of different [...] is limited, it
necessarily comes about that at a [...] early stage in the life of the
organism all the receptor [...] connected in various combinations to
one reaction or another. As a result, when a “new” situation is
encountered by the organism [...] adapted to, it is inevitable that the
stimulus components of the “new” situation will, through the process
of stimulus generalization (77, pp. 185, 186; X A) tend to evoke with
[...] intensity all of the reactions and reaction [...] learned. Thus,
after life has gone on for a [...] period, the stimulus components of
a “new” situation [...] the novelty consists almost entirely in the
fact that the stimulus elements come in a new combination,
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positive or the negative reaction potential (if either) is initially 
greater, and the magnitude of the difference between them in 
case any difference exists. In order to have available a convenient 
means of reference to the reaction potential characteristics of the 
two competing response tendencies, we shall represent that of the
“correct” reaction by the symbol sEr+ and that of the “incorrect”
reaction by the symbol sEr_.

The above considerations lead to the need for a statement of a 
number of quantitatively distinguishable combinations of ante- 
cedent conditions or cases assumed at the beginning of the trial- 
and-error process. These are as follows: 


Case I      sEr+ = 2.5σ;     sEr_ = 2.5σ.

Case II     sEr+ = 4.5σ;     sEr_ = 4.5σ.

Case III    sEr+ = 5.0σ;     sEr_ = 2.0σ.

Case IV     sEr+ = 2.0σ;     sEr_ = 5.0σ.

Case V      sEr+ = .856σ;    sEr_ = 5.0σ.


Cases I and II Where sEr+ and sEr_ Are Equal

We have before us at this point the task of tracing quantitatively
the characteristic events of distributed-trials simple trial-and-error
learning under two related sets of conditions, i.e., Cases I and II,
where sEr+ = sEr_ and both may be relatively weak or relatively
strong. Proceeding directly to the consideration of Case I, we take
sEr+ and sEr_ at the comparatively low level of 2.5σ each. This
means that the probability of each reaction occurring will at the
outset be .5 or 50 per cent. Let us first consider the 50 per cent of
organisms which respond to stimulation with R+ and so receive
reinforcement. According to the present system, the increment to sEr+
from one reinforcement is given by the equation (77, p. 120):

\[ \Delta\,{}_{S}E_{R+} = M - {}_{S}E_{R+} - (M - {}_{S}E_{R+})\,10^{-i} \qquad (32) \]

where,

ΔsEr+ is the increment to the reaction potential from a single
reinforcement under the stated conditions;

M is the reaction potential at the limit of practice under the
conditions of reinforcement obtaining, here taken as 6.0σ;

sEr+ is the reaction potential just previous to the reinforcement
under consideration, here taken as 2.5σ; and
continuum. But since R+ and R_ by assumption cannot be
performed simultaneously, only the one of the two which is at the
moment the stronger will be performed (xiv).


Quantitative Assumptions 

A theoretical analysis of behavior phenomena is fairly complex
even when the simplest possible conditions are assumed.
Accordingly we shall begin our analysis with a radical simplification that
will limit our competing reaction potentials, which are
superthreshold in magnitude (XIII B; 77, pp. 304 ff.), to two responses.
Secondly, in order to eliminate the complications inherent in
perseverational stimulus traces in those experimental extinction
effects (IX A; 77, pp. 258 ff.) which are susceptible to spontaneous
recovery (IX B; 77, pp. 258 ff.), we shall assume that an interval
of 24 hours occurs between successive trials. This means that in
case of an erroneous reaction no further or correctional choice is
permitted on that trial (72). We shall assume further (X A) that
at every reinforcement of R+ there is not only a direct gain to R+
in the increment (Δ) in sEr+ but there is a positive generalization
transfer from ΔsEr+ to sEr_; and at every lack of reinforcement of
the occurrence of R_ there is not only a loss in the reaction potential
from ΔsIr_ to R_, but a generalized loss transfer of ΔsIr_ to sIr+.
Finally we shall assume that the statement of the competing


reaction potentials utilized in the deductions represents the state 
of affairs after external inhibition (11, p. 217), incidental to the
presentation of stimuli in combinations different from those under 
which original reinforcement occurred, has already taken place. 

Because of the wide separation of the trials here assumed we shall
call this distributed-trials simple trial-and-error learning. This situation
is in contrast to one involving closely successive repetitions, which
is known as massed-trials simple trial-and-error learning. The
distributed-trials analysis will occupy us throughout the next two sections,
[...] massed-trials simple trial-and-error learning [...].

[...] certain quantitative conditions which are so central to the
trial-and-error process utilized in our exposition that they
cannot [...]. One of these is the amount of reaction potential
characteristic of each of the competing reaction
tendencies, from which there follows the question of whether the
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of the head. This usually near identity of the responding organs for 
R+ and R_ gives the two responses much in common with most
animals, though by no means complete identity of response. If 
they were identical, generalization would of course be very high. 
As things stand in regard to the responses it is clear that there is 
bound to be less stimulus generalization than the marked similarity 
of the stimulus combinations would produce. 

Generalizing on the preceding considerations, we arrive at the 
following major corollary: 

Corollary xvi

A. When an organism acquires two reaction potentials, sEr+ and
sEr_, the two stimuli of which are very similar and the responses of
which are different though they often involve substantially the same
muscles, there will be in addition to a gain in sEr+, also a
generalization of reaction potential from ΔsEr+ to sEr_.

B. When one of the habits (sEr_) undergoes partial or complete
experimental extinction, there will be, in addition to the loss in sEr_,
also a generalization of ΔsIr_ from this false reaction tendency to the
other or correct one (sEr+).

Evidence has been found verifying both part A and part B of the
above corollary. This is as follows:

Part A. Holland, using the apparatus shown in Figure 13 (p. 50),
trained 115 albino rats on one manipulandum alone for 20
trials. This bar was then retracted and the second manipulandum
was extended into the animal’s chamber. The results showed that
training on the first bar greatly facilitated the learning on the
second bar. For example, the latency of the twelfth and thirteenth
trials on the first bar was the same as that of the third and fourth
trials on the second bar (3, pp. 28-29). Thus Corollary xvi A finds
empirical confirmation.

Part B. In a somewhat similar experimental situation four
comparable groups of 25 albino rats each showed on the average
[...] per cent as many trials to the experimental extinction
of one habit immediately after the other habit had been
extinguished as they did when the same habit was extinguished first,
[...] marked perseverative generalization of extinction
effects (9, p. 247) and substantiating Corollary xvi B.
i is the exponential constant characteristic of the particular
organism under the given conditions of learning, here taken as
.091.*

Substituting appropriately in equation 32, we have,

\[
\begin{aligned}
\Delta\,{}_{S}E_{R+} &= 6.0 - 2.5 - (6.0 - 2.5)\,10^{-.091} \\
&= 3.5 - \frac{3.5}{1.2333} \\
&= 3.5 - 2.8386 \\
&= .6614.
\end{aligned}
\]

It follows that the sEr+ after the reinforcement must be,

sEr+ + ΔsEr+ = 2.5 + .6614 = 3.1614.
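
The substitution can be checked numerically. The following is a minimal Python sketch of equation 32, assuming the values stated in the text (M = 6.0, i = .091, initial potential 2.5); the function name `reinforcement_increment` is illustrative and does not come from the source.

```python
def reinforcement_increment(e, m=6.0, i=0.091):
    """Equation 32: increment to a reaction potential e from one reinforcement.

    m is the limit of practice and i the exponential constant, both taken
    from the values assumed in the text.
    """
    return m - e - (m - e) * 10 ** (-i)


delta = reinforcement_increment(2.5)   # increment for Case I
new_e_plus = 2.5 + delta               # potential after the reinforcement
print(round(delta, 3), round(new_e_plus, 3))
```

The machine result agrees with the hand-computed .6614 and 3.1614 to about three decimal places; the small residual comes from the rounded divisor 1.2333 used in the hand computation.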

[...] of stimulus generalization. In the trial-and-error situation
assumed here there are in common in the evocation of R+ and R_ the
explicit stimuli, S+ and S_, together with [...] stimuli within the
body of the organism. These make up a considerable [...] common
stimuli conditioned to the response. [...]

On the other hand, R+ consists in the pushing of a small brass
vertical bar (S+) to the left (Figure 13, p. 50), whereas R_ consists
in the [...] a horizontal bar (S_) of the same [...] but situated
about [...] inches away. These objects considered separately are
rather different [...]. Actually, during the trial-and-error process
[...] are presented at the same time. In short it is evident that so
far as the stimuli are concerned there will be a considerable, though
by no means complete, stimulus generalization.

Moreover, we have here a complication in that R+ and R_ often
involve the use of practically the same muscles, though in a more
or less different way. In the [...] animal, however, one bar would be
moved [...] the other with the side

* This is about five [...] in an experiment (9, [...]) [...] taken in
the most closely [...] of any known [...] has been taken arbitrarily
from the
Moreover, the new sEr_ will be,

sEr_ + ΔsIr_ = 2.5 − .4724 = 2.0276.

It follows that after the false response (R_) has been made, the
respective reaction potentials of this second group of subjects will
be,

sEr+ = 2.4055; sEr_ = 2.0276.

Now because of the fortuitous nature of the oscillation function,
it would be quite impracticable to trace out in arithmetical detail
the consequences of all of the various possible combinations of
correct (+) and incorrect (−) responses as they might occur in a
particular organism, to say nothing of all the organisms. We shall
accordingly resort to an approximation to this. This is suggested
by the practice among experimentalists of pooling the response
scores of all the organisms within a given experiment which are
regarded as comparable in learning rate. Specifically, we shall
calculate a weighted average of the first-trial sEr+ scores of the two
groups of subjects calculated above; this will be taken as the sEr+
value of the group as a whole at the beginning of the next trial,
and the same will hold for the sEr_ value. This amounts to adding
together the products of the two sEr+ values, each multiplied by the
proportionate chance of occurrence, .50. Accordingly, at the
beginning of the second trial,

sEr+ = 3.1614 × .50 + 2.4055 × .50
     = 1.5807 + 1.2027
     = 2.7835.

Similarly for sEr_, we have:

sEr_ = 2.6323 × .50 + 2.0276 × .50
     = 1.3162 + 1.0138
     = 2.3300.

Thus at the beginning of the second trial we have the competition
between these two means:

sEr+ = 2.7835 and sEr_ = 2.3300.
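
The pooling step is an ordinary probability-weighted mean. As a small sketch, assuming only the four post-response values and the .50 chance given in the text (the function name `pooled_potential` is hypothetical):

```python
def pooled_potential(value_if_correct, value_if_wrong, p_correct):
    """Weighted mean of a reaction potential over the R+ and R_ subgroups."""
    return value_if_correct * p_correct + value_if_wrong * (1 - p_correct)


# Values after the first trial, as computed in the text for Case I.
e_plus = pooled_potential(3.1614, 2.4055, 0.50)
e_minus = pooled_potential(2.6323, 2.0276, 0.50)
print(round(e_plus, 3), round(e_minus, 3))
```

On later trials the same function would be called with the p+ value of the preceding trial in place of .50.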

Now the probability of the responses of the organisms as
represented by these two means would evidently be substantially the
We shall accordingly assume (xvi A) that each ΔsEr+ generalizes
20 per cent to sEr_. This means that in the present case,

ΔsEr_ = .6614 × .2 = .1323.

Therefore the sEr_, as an indirect result of the reinforcement of sEr+,
also undergoes a gain, i.e.,

sEr_ + ΔsEr_ = 2.5 + .1323 = 2.6323.

As a result of the preceding computations, those subjects which
respond correctly on the first trial should have for the two competing
reaction potentials after the response (and its reinforcement),

sEr+ = 3.1614; sEr_ = 2.6323.
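
The reinforced branch can be sketched in a few lines of Python. This assumes the 20 per cent generalization and the constants M = 6.0 and i = .091 used in the text; the function and variable names are illustrative only.

```python
GENERALIZATION = 0.20  # proportion of each increment assumed to generalize


def after_reinforced_response(e_plus, e_minus, m=6.0, i=0.091, g=GENERALIZATION):
    """Both potentials after one reinforced R+ response.

    The direct gain follows equation 32; a fraction g of that gain is
    added to the competing potential.
    """
    gain = (m - e_plus) * (1 - 10 ** (-i))
    return e_plus + gain, e_minus + g * gain


e_plus, e_minus = after_reinforced_response(2.5, 2.5)
print(round(e_plus, 3), round(e_minus, 3))
```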

[...] in the reaction potentials in [...] respond with R_. Here we
assume that the [...] inhibition possible of generation from a
reaction potential [...] experimental extinction equals the sEr in
[...], though [...] constants as ΔsEr+.* Accordingly,

ΔsIr_ = sEr_ − sEr_ × 10^(−i)
      = 2.5 − 2.5 / 1.233
      = 2.5 − 2.0276
∴ ΔsIr_ = .4724.

[...] 20 per cent of this inhibition [...] the competing correct
response (R+). Therefore the generalized inhibition will be,

.4724 × .2 = .0945.

[...] these subjects, since inhibition is inherently negative, is

sEr+ + ΔsIr+ = 2.5 + (−.0945) = 2.4055.
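
The unreinforced branch can be sketched in the same way. The sketch below assumes, following the arithmetic shown in the text, that the inhibition generated by one unreinforced R_ is the potential itself reduced by the factor 10^(−i), and that 20 per cent of it generalizes to the correct response; the function name is hypothetical.

```python
def after_unreinforced_response(e_plus, e_minus, i=0.091, g=0.20):
    """Both potentials after one unreinforced R_ response.

    inhibition is the loss charged to the erroneous potential; a fraction
    g of it is also subtracted from the correct potential.
    """
    inhibition = e_minus - e_minus * 10 ** (-i)
    return e_plus - g * inhibition, e_minus - inhibition


e_plus, e_minus = after_unreinforced_response(2.5, 2.5)
print(round(e_plus, 3), round(e_minus, 3))
```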


' Th«c compuutioiu ww^. 

W * change the equation 13 became 

Wievni iK,, i, ^ ^e ofSe rcults but it ia not 
TABLE 1. A table showing the theoretical progress of distributed-trials simple
trial-and-error learning where the asymptote (M) of the primitive sEr+ learning
curve is taken as 6σ, and where the competing reaction potentials begin equal at
the relatively low level of 2.5σ (Case I).



FIGURE 2. Graphic theoretical representation of the increase in sEr+ and the general
decrease in sEr_ as the distributed trials of a simple trial-and-error learning progress
(Case I). Each trial is counted as one, regardless of whether the response is R+ or R_.
Note the inflected form of the sEr+ curve and the slight rise in the sEr_ curve after trial
4. Plotted from values presented in Table 1. (Horizontal axis: ordinal number of
distributed trials.)
same as would be the probability of the dominance of one of any
other two events represented by means in ordinary statistical
practice.* Assuming that the standard deviation of the respective
means alike is 1.0, the standard deviation of the difference (σd)
between the two means (XII G) therefore becomes:

\[
\begin{aligned}
\sigma_d^2 &= \sigma_+^2 + \sigma_-^2 = 2\sigma^2 \\
\sigma_d &= \sigma\sqrt{2} = 1.414. \qquad (33)
\end{aligned}
\]

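Accordingly we may write the following equation, using x in place
of d to conform with statistical usage:

\[ \frac{x}{\sigma_d} = \frac{{}_{S}E_{R+} - {}_{S}E_{R-}}{\sigma\sqrt{2}} \qquad (34) \]

Substituting,

\[ \frac{x}{\sigma_d} = \frac{2.7835 - 2.3300}{1.414} = \frac{.4535}{1.414} \]

\[ \therefore \frac{x}{\sigma_d} = .3207. \]

Referring to a table of the normal probability function, e.g.,
Guilford (7, p. 538), we find that this value corresponds to an area
of .126, and that

p+ = .126 + .500 = .626,

p_ = 1.00 − .626 = .374.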
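
The table lookup can be replaced by a direct evaluation of the normal probability function. The following sketch assumes, as the text does, a standard deviation of 1.0 for each potential; it uses the error function from Python's standard library, and the function name `response_probabilities` is illustrative.

```python
import math


def response_probabilities(e_plus, e_minus, sigma=1.0):
    """Return (x / sigma_d, p_plus, p_minus) for two competing potentials.

    sigma_d = sigma * sqrt(2) is the standard deviation of the difference
    (equation 33); p_plus is the normal probability that the difference
    favors R+.
    """
    sigma_d = sigma * math.sqrt(2)
    z = (e_plus - e_minus) / sigma_d
    p_plus = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_plus, 1 - p_plus


z, p_plus, p_minus = response_probabilities(2.7835, 2.3300)
print(round(z, 3), round(p_plus, 3), round(p_minus, 3))
```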

By successively repeating the process just described, each time
using in the weighted-[...] the p+ and p_ values secured
in the preceding computation, we obtain the theoretical results
appearing in Table 1 and [...]. An inspection of [...]
table and figures shows that:

* [...] difference [...] necessarily utilized [...].
The sEr+ increases with successive trials, at first with a positive
acceleration which later inflects into a negative acceleration. This
differs from the theoretical primitive curve of learning (IV).

The sEr_ decreases with successive trials, with a negative
acceleration, the decrease between trials 4 and 5 changing to an actual
though slight gain which continues as far as calculated, ultimately
recovering two-thirds of the sEr_ lost during the first four trials.
This paradox is due to the fact that R+ reinforcements (ΔsEr+)
generalize (xvi A) appreciably to R_.

The curve of learning shown as Figure 3, a probability function,
is not dependent upon either primary learning function alone
(sEr+ and sEr_, Figure 2), but upon the difference between them.
This means that when the probability learning curve reaches its
maximum or 100 per cent + responses, sEr+ has by no means
reached its asymptote nor has sEr_ nearly reached zero. This
probability curve of learning approaches the conventional “curve
of learning” in form (7d, p. 575), though a curve fit shows that
despite a very close approximation it fails systematically.*

We turn next to Case II, where at the outset sEr+ = sEr_ and
both values have a magnitude of 4.5σ. The computational
procedure is exactly the same as that given above which generated
Table 1. The results for Case II are not particularly different
except as to the curves of sEr+ and sEr_. These are presented in
Figure 4. An inspection of this figure shows the same rise in sEr+
and the same fall in sEr_ as appears in Figure 2 except that the rise
of sEr+ is less and the fall of sEr_ is greater. Moreover the curve of
sEr_ has no terminal rise, as it does in Figure 2. The computations
show that 25 trials are required to reach a .978 probability of dominance
of sEr+, as compared with 10 trials for Case I.
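
The whole iterative procedure can be put together in a short simulation. This is a sketch under the assumptions stated in the text (M = 6.0, i = .091, 20 per cent generalization, σ = 1.0 for each potential, pooling by probability-weighted means); it does not model the reaction threshold, so it should not be applied to Case V. All names are illustrative.

```python
import math

M, I_RATE, GEN, SIGMA = 6.0, 0.091, 0.20, 1.0  # constants assumed in the text


def p_correct(e_plus, e_minus):
    """Probability that R+ dominates R_ (normal difference, equations 33 and 34)."""
    z = (e_plus - e_minus) / (SIGMA * math.sqrt(2))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def next_trial(e_plus, e_minus):
    """Pooled potentials at the start of the next distributed trial."""
    k = 1 - 10 ** (-I_RATE)
    p = p_correct(e_plus, e_minus)
    gain = (M - e_plus) * k          # direct gain if R+ occurs and is reinforced
    loss = e_minus * k               # inhibition if R_ occurs unreinforced
    plus_right, minus_right = e_plus + gain, e_minus + GEN * gain
    plus_wrong, minus_wrong = e_plus - GEN * loss, e_minus - loss
    return (p * plus_right + (1 - p) * plus_wrong,
            p * minus_right + (1 - p) * minus_wrong)


def simulate(e_plus, e_minus, trials):
    """List of (trial, e_plus, e_minus, p_plus) rows, one per trial."""
    rows = []
    for n in range(1, trials + 1):
        rows.append((n, e_plus, e_minus, p_correct(e_plus, e_minus)))
        e_plus, e_minus = next_trial(e_plus, e_minus)
    return rows


for n, ep, em, p in simulate(2.5, 2.5, 10):   # Case I
    print(f"{n:2d}  {ep:6.3f}  {em:6.3f}  {p:5.3f}")
```

Calling `simulate(4.5, 4.5, 25)` gives the corresponding sketch for Case II. Because the hand computations rounded intermediate values, later rows may differ from the published tables in the third decimal place.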

Generalizing on the preceding considerations we now proceed
to formulate our first theorem:

THEOREM 1. In distributed-trials simple trial-and-error learning
where there are only two superthreshold competing reaction potentials,
and where at the beginning sEr+ = sEr_:

A. There are evolved two primary curves of learning (sEr+ and sEr_)
as a function of the number of trials (N).

* The fitted equation obtained is,

p = .755(1 − 10^(−[...]N)) + .245.
late Table 2 according to these assumptions. An inspection of this
table shows that the two primary learning curves naturally begin
at quite separate points. Their course is so brief to the 99 per cent
level of sEr+ that they have no theoretical significance, except as to
methodology. Accordingly we present no graphs of this case.

TABLE 2. A table showing the theoretical progress of distributed-trials simple
trial-and-error learning where the asymptote of the sEr learning potentiality is
taken as 6.0σ and where the competing reaction potentials are initially:
sEr+ = 5.0σ and sEr_ = 2.0σ (Case III).

Trial      Reaction potential (sEr)                Reaction probability (p)
number     sEr+         sEr_         d/σd          p+          p_

  1        5.0          2.0          2.122         .983        .017
  2        5.184        2.031        2.230         .987        .013
  3        5.338        2.056        2.231         .990        .010

FIGURE 5. Graphic representation of the theoretical curves of the courses of sEr+ and
sEr_ during a simple trial-and-error process by distributed trials, where sEr+ originates
at 2.0σ and sEr_ at 5.0σ (Case IV). Note that the sEr+ curve begins with a period of
fall (through generalized ΔsIr+), and that the sEr_ curve terminates with a period of
rise (through generalized ΔsEr_). Plotted from values shown in Table 3. (Horizontal
axis: ordinal number of distributed trials.)

We next proceed to the consideration of Case IV, where sEr+
= 2.0σ and sEr_ = 5.0σ. Table 3 is based on the same computational
methods as those used in Tables 1 and 2, and from this
B. The proportion of reinforcement (+) trials increases as trial and
error continues, whereas that of the non-reinforcement (−) trials
progressively decreases.

C. The curve of sEr+ = f(N) in both Case I and Case II at first
rises with a positive acceleration and later becomes negatively accelerated.

D. The curve of sEr_ = f(N) in Case I at first falls with a negative
acceleration and later rises slowly but continuously, with a positive
acceleration, then with a negative acceleration.

E. The greater the initial values of sEr+ and sEr_, the smaller will be
the relative rise in sEr+ as a result of the trial-and-error process, and
the greater the fall in sEr_.

F. The probability curve of learning is a function of the amount of
difference between the two primary curves of learning, sEr+ and sEr_.

G. At the first trial the probability of occurrence of the respective
reactions approaches equality, i.e., it is .50 : .50.

H. The course of the probability curve of learning is a negatively
accelerated rise approaching the conventional learning function in shape.


[...] process. In a situation where sEr+ and sEr_ were nearly equal,
however, [...] from 1.4 seconds at the first trial to [...] at the
one-hundredth trial, thus showing a [...], [...] at the first trial
to .8 second at the one-hundredth trial, thus showing a marked gain.
This constitutes a partial empirical verification of Theorem 1 A.

[...] Theorems 1 B, [...] E, F, and G, though the [...] values of
Holland’s trial-and-error series, if preserved, [...] 1 C and 1 D.

A probability [...] distributed-trials simple trial-and-error
learning [...] Figure 8 (p. 37). [...] on the whole a little the
stronger at the outset since the p+ on the first [...] around .30.
However, between the third and fifth trial [...]. Thus from that
point on, Theorem 1 H finds [...].

Cases III, IV, and V

As stated above (p. 21), [...]. Pursuing the method of computation
already explained, we formu-
table Figures 5 and 6 are plotted. An inspection of these figures
shows at once that Case IV is radically different from the three
preceding cases. We derive the following conclusions:

1. sEr+ begins with a fairly protracted period of loss (from
generalized ΔsIr_), after which it rises rather rapidly.

2. sEr_ follows fairly consistently a negatively accelerated loss,
after which it shows a fairly protracted period of gain (from
generalized ΔsEr+).

3. The curves of sEr+ and sEr_ cross.

4. The curve of correct-response probability follows a
characteristically sigmoid course with a positively accelerated rise at first
which later turns to a negatively accelerated rise.

We now proceed to Case V, where sEr_ = 5.0σ and sEr+ = .856σ.
It must be noted that since the reaction threshold is .356σ, this
leaves .856σ − .356σ, or .5σ, as a superthreshold sEr+ value. By
computations exactly analogous to those employed in the four
preceding cases, the outcome represented in Table 4 has been

TABLE 4. A table showing the early details of the trial-and-error process where
sEr+ = .856σ and sEr_ = 5.0σ (Case V). Note that the reaction threshold (sLr)
is taken as .356σ, which leaves as the superthreshold value of sEr+, .856σ − .356σ,
or .5σ, which is listed separately. Note also that as far as carried sEr+
and sEr_ grow progressively less.

Trial     Reaction potential (sEr) in σ                     Reaction probability (p)
number    sEr+     sEr+ (superthreshold)    sEr_     d/σd      p+        p_

  1       .856         .5                   5.0     -3.182    .001      .999
  2       .812         .312                 4.056   -2.648    .004      .996
  3       .664         .164                 3.294   -2.214    .013      .987
  4       .555         .055                 2.682   -1.858    .032      .968
  5       .396        -.007                 2.199             .000     1.000
  6       .266        -.090                 1.783             .000     1.000


obtained. As the trials go on practically only R_ responses occur,
and the generalization of the ΔsIr_’s resulting from these
unreinforced responses very soon brings the already weak sEr+ to a
negative, i.e., subthreshold, value which grows less and less with each
trial. But (XII B) the range of oscillation (sOr) in this region
approaches zero as sEr+ approaches absolute zero. There are several
consequences of this, though none is fully represented in Table 4.
FIGURE 6. Graphic representation of the theoretical course of a
probability-of-reaction curve [...] of the two reaction potentials shown in
Table 3 and Figure 5. Note the markedly sigmoid shape of this curve.
Proceeding to the empirical evidence concerning these theorems,
we find nothing specifically relevant to Theorems 2 and 3. There
is, however, a certain amount of continuous-trials trial-and-error*
evidence on Theorems 4 and 5. It was found in hitherto
unpublished data of an otherwise published study (9) that 72 albino rats
carried out this latter type of learning, with 39 trials by the median
animal. These results were manipulated according to the equally
weighted Vincent method and the per cent of R+ reaction
evocations at each of the 39 trials was calculated. These data were then
averaged by threes. The resulting values are represented graphically
in Figure 7. There the initial portion of the empirical curve
corresponds closely to the theoretical one (Figure 6), but the latter
portion does not, probably because the training was not continued
through enough trials to show the final slowing effect.

FIGURE 7. Vincent curve of the empirical probability of correct (R+) reaction
evocation in the course of simple trial-and-error learning by the continuous-trials method.
Note the similarity to Figure 6, which represents the results of a theoretical analysis of
the same case of learning but by distributed trials. (Horizontal axis: number of
reinforcements.)

In regard to Theorem 5, it has been reported (9) that out of 83
subjects submitted to trial-and-error learning by continuous trials
under conditions substantially like those of Case IV (or Case V),
seven animals responded with complete failure of R+ responses.

* See a subsequent section (p. 38) devoted to this subject.
One is that σd will grow progressively smaller as the trials continue,
which will make d/σd increase and this taken alone would make p+
decrease. A second result is that whenever the reaction potential
falls below the reaction threshold (sLr) no response based on that
momentary reaction potential can take place. This means that as
sEr+ approaches sLr a progressively larger proportion of the sEr+
values will fail to compete with sEr_, which will tend to decrease
the p+ values as the number of trials continue. And, finally, when
[...] (median) becomes negative (below the reaction
threshold, sLr) this will still further decrease the competition of
sEr+ with sEr_ until p+ will ultimately become zero and p_ will
[...]. [...] point in the consideration of Table 4 this is symbolized
by an abrupt transition to .000 [...] negative value, though strictly
speaking [...] negative value of sEr+ here [...] diminishing sOr [...].

Generalizing on the preceding considerations, we arrive at the
following theorems:

THEOREM 2. [...] learning process up to p = .99 [...] essentially the
general course described in Theorem 1 H.

THEOREM 3. [...] begins with a brief but fairly marked loss followed
by [...] falls with a negative acceleration [...] rises perceptibly.

THEOREM 4. The curve of [...] characteristic positively accelerated
[...] to a slower negatively accelerated [...] toward 100 per cent,
making in all a sigmoid figure.

THEOREM 5. [...] sEr+ [...] to negative values below the response
threshold, the momentary oscillation values almost [...]
reinforcements will almost never [...] the learning process will be a
biological failure.
THEOREM 7. Tests made during the simple learning process will
show that distributed trials will be more efficient than massed trials.

But since (ix) conditioned inhibition (sIr) has much the status
of an ordinary habit and so does not dissipate appreciably with
time, and since,

sĒr = sEr − sIr,

there follows our next theorem:

THEOREM 8. Tests applied 24 hours after the completion of a given
amount of learning will show, other things equal, that
distributed-trials learning is more efficient than massed-trials learning, though the
advantage is not as great as will appear during the learning processes.

FIGURE 8. Probability learning curves of simple trial-and-error learning by distributed
trials (above) and by massed trials (below). Adapted from Holland (3, p. 39).
(Horizontal axis: number of free-choice trials.)

Unfortunately there is not very much empirical evidence bearing 
directly on the theorems of the present section, so far as simple 
trial-and-error learning is concerned, though there is a great deal 
of pertinent evidence from other types of learning: 

1. There is a wealth of evidence regarding the matter of reminis- 
cence in rote and other forms of learning (13, pp. 263-270) which 
substantiates Theorem 6. 
Yet when tested 52 hours later, after Ir would have been completely
dissipated (IX B), they responded with 37, 4, 0, 1, 1, 2, and 4
reactions respectively, showing that the sEr was even then existent
to a small but superthreshold amount in all but one animal (9,
p. 249). These results accordingly give a fair empirical
substantiation to Theorem 5.


Simple Trial-and-Error Learning by Massed Trials

The distinction between massed and distributed trials in learning
is one of degree. If a period of 24 hours should intervene between
trials the learning would surely be called distributed, and if a
period of 30 seconds should intervene the learning would be called
massed. However, there comes an intermediate point at which it is
[...] which term is more applicable. From one point of view [...]
dividing line between distributed and massed [...] by a [...] long
enough for the stimulus trace (s') to have decreased (II B) [...].
[...] only important to note that distributed trials are [...]
longer intervals than are massed trials, but not long enough [...]
more forgetting. It follows from this that more [...] generated by
each reaction (IX [...]) [...] during distributed trials than [...]
massed. It follows further that during and at the termination of
massed learning more Ir will be present than [...] termination of
distributed trials. From this and IX [...]:

THEOREM 6. Tests [...] termination of simple trial-and-error learning
will show [...] spontaneous increase (reminiscence) [...] reaction
potential if [...] learned by massed trials than if learned by
distributed trials.

By logic analogous to the [...] that more Ir will [...] present
throughout [...] than by distributed or less-massed trials. From this
[...] by massed trials more conditioned inhibition (sIr) will be
generated than by distributed trials. But

İr = Ir + sIr.
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and-error learning, a special form of learning by massed trials. 
By this procedure the organism remains continuously in the pres- 
ence of the competing manipulanda. For this reason the timing of 
the reaction evocations is more or less irregular since they depend 
upon the changing conditions within the animal’s body as mani- 
fested by behavioral oscillation (XII A) and other factors. Let us 
now examine the problem of how trial-and-error learning can occur 
under such conditions. 

The question is a historical one. A number of earlier writers, such
as Hobhouse (2, p. 174), Holmes (4, p. 166), Thorndike (77, pp. 188
ff.), Watson (75, p. 262), and Koffka (75, pp. 158 ff.) have struggled
with it. Posed most sharply by Watson, Thorndike, and Koffka,
the situation is presented by our Case IV except that under the
conditions here considered learning occurs by continuous trials. We
assume that in this case sEr+ = 1.0σ and sEr_ = 5.0σ at trial 1.
Now it is evident that under such conditions R_ ordinarily will
occur without reinforcement quite a number of times before R+
takes place and receives its reinforcement. Let us assume five
continuous repetitions of R_, after which R+ occurs, i.e.,

R_ R_ R_ R_ R_ R+.

Watson argued that because R+ occurred at the last of the series
immediately preceding reinforcement it would receive stronger
reinforcement than would R_; Thorndike and Koffka argued truly
that in a situation such as we have assumed, R_ would really
occur more times than R+, and that by the supposed “law of use”
would receive a greater increment to its learning.

This condition requires us to make use of a principle not hitherto
cited in this chapter, namely, the gradient of reinforcement (iii B).
According to this principle, the greater the time which intervenes
between an act and its reinforcement, the smaller will be the ΔsEr
which results from the reinforcement. We may represent this
roughly by the equation,

\[ J = 10^{-jt}, \qquad (35) \]

where t is the time intervening between the act and its
reinforcement and j has a tentative value, here taken as .163. From these
considerations and equation 35, it follows that,

J ^ bEr, - 6.0<r X (36) 
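
The gradient of reinforcement can be illustrated numerically. The sketch below assumes equation 35 has the form J = 10^(−jt) with j = .163; the delays in the example are purely hypothetical (the source does not state the spacing of the five R_ responses or the unit of t), and the function name `gradient_factor` is illustrative.

```python
def gradient_factor(t, j=0.163):
    """Equation 35 (assumed form): discount applied to reinforcement delayed by t."""
    return 10 ** (-j * t)


# Hypothetical delays, in arbitrary time units, between each response in the
# series R_ R_ R_ R_ R_ R+ and the reinforcement that follows R+.
delays = [5, 4, 3, 2, 1, 0]
factors = [gradient_factor(t) for t in delays]
for t, f in zip(delays, factors):
    print(f"delay {t}: factor {f:.3f}")
```

Under these assumed delays the response nearest the reinforcement receives the full effect and each earlier response receives progressively less, which is the qualitative point of the principle.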
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2. Holland (3) gave simple trial-and-error training to 45 albino 
rats by distributed trials, and similar training to 45 rats by massed 
trials. The advantage of the distributed-trials group is evident from 
an inspection of Figure 8. This gives empirical substantiation to 
Theorem 7. 


3. Kimble (74) trained 50 human subjects on an upside-down 
printing task by massed practice, and 46 comparable subjects 
on the same task by spaced practice. Six other groups of about 60 
subjects each were trained for 5, 10, 15, 20, 25, and 30 minutes 
by massed practice, after which they were given 10 minutes' rest 
         Original   Summated                       Generalized   Net result
         sEr        Δ's        Δ proper            Δ             (+), (-)

sEr_     +5.0       +.155      -.100, -3.246       +.172         +3.903

sEr+     +1.0                  -.100, +.945        -.658         +1.188


A glance at this table will show that the erroneous tendency 
(sEr_) has in this trial alone shifted from 5.0σ downward to 3.903σ, 
a learning gain of around 1.1σ; and that the correct response 
tendency (sEr+) has shifted from 1.0σ upward to 1.188σ, a gain of 
around .2σ. Thus on both counts the organism will be better 
prepared to survive, especially if tested again before the Ir dissi- 
pates. In this connection we hasten to point out that owing to the 
complexity and novelty of the above computations, together with 
our lack of knowledge of the numerous parameters involved, no 
special significance should be attached to the values secured, 
though the procedure should explain in a concrete manner the 
general nature of the theory. Actually, of course, if we count each 
of the false responses here involved as a separate trial the gain is 
not so very much more than was yielded by the same number of 
acts (Table 3, p+ column) by the distributed-trials procedure in a 
somewhat similar situation. 

Generalizing on the above considerations, we arrive at the follow- 
ing theorem: 

THEOREM 9. Simple trial-and-error learning by the continuous- 
trials procedure will eventuate in positive learning, the end result being 
much the same as by the distributed-trials procedure. 

That continuous-trials simple trial-and-error learning really 
occurs without difficulty has long been known empirically. The 
only question historically is how to explain, i.e., deduce it. It 
would appear that Watson was on the right track with his observa- 
tion that reinforcement is associated with the last response of each 
response series, though at that time without the knowledge of the 
gradient of reinforcement and especially of the accumulation of Ir 
it was impossible to perform the deduction. 

Response Alternation in General Where Two Reaction Potentials Are in 
Repeated Competition 

The preceding pages have had much to say concerning the momentary 
oscillation of reaction potentials. But the momentary oscilla- 

where sErd represents reaction potential due to delay in reinforce- 
ment. 

Assuming that each of the above listed R_’s requires an average 
of two seconds for its performance, there would be delay intervals 
of 2, 4, 6, 8, and 10 seconds before the several erroneous reactions 
would receive reinforcement. Substituting the t values one at a 
time in equation 36 and solving, we secure the following reaction 
potential values for the several R’s at the limit of practice: 


Delay in reinforcement in sec.      10       8       6       4       2
sErd at the limit of practice    .1404   .2982   .6311  1.3368  2.8326
ΔsEr_ at the first trial         .0044   .0094   .0199   .0421   .0892
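The two rows of the table above can be recomputed with a short sketch. The first function is equation 36 as given in the text. The second rests on two assumptions that are not legible in the source: that each delayed limit is scaled by the unfilled fraction (6.0 - 5.0)/6.0 before being used as M in equation 37, and that the exponent of equation 37 is .091. Both were inferred only because they happen to reproduce the tabled values, so they should be read as a plausible reading rather than as the author's stated procedure.

```python
# Sketch of the delay-of-reinforcement table (equations 36 and 37).
LIMIT = 6.0      # limit of reaction potential, in sigma units (equation 36)
J = 0.163        # tentative gradient exponent j (equation 35)
CURRENT = 5.0    # strength sEr_ already has at trial 1 (from the text)
GROWTH = 0.091   # ASSUMED exponent of equation 37, inferred from the table


def delayed_limit(t_seconds):
    """Equation 36: limit of sEr when reinforcement is delayed t seconds."""
    return LIMIT * 10 ** (-J * t_seconds)


def first_trial_increment(t_seconds):
    """Increment to sEr_ on the first trial (equation 37).

    ASSUMPTION: M is the delayed limit scaled by the fraction of the
    6.0-sigma limit that sEr_ has not yet reached.
    """
    m = delayed_limit(t_seconds) * (LIMIT - CURRENT) / LIMIT
    return m * (1 - 10 ** (-GROWTH))


if __name__ == "__main__":
    for t in (10, 8, 6, 4, 2):
        print(t, round(delayed_limit(t), 4), round(first_trial_increment(t), 4))
```

Small differences from the printed figures in the fourth decimal place are to be expected, since the original values were presumably worked from logarithm tables.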


already has 5.0 of its possible 6.0) one at a time as M's in equation 37, 

$$\Delta{}_{s}E_{R} = M(1 - 10^{-.091}), \tag{37}$$

Now the ΔsEr_ values in the above table, taken as provisional 
reaction potentials in succession, amount, according to our 
summation equation (v), to .155σ. The five R_'s also generate ΔIr 
according to the equation, 

$$\Delta I_R = 5.0(1 - 10^{-.091 \times 5});$$

this, combined with the Ir resulting from work (W), generalizes 
20 per cent to sEr+, which amounts to .658. On the other hand, 
20 per cent of the ΔsEr+ generalizes to sEr_. In addition, a small 
amount of Ir, .100σ, results from the work (W) of executing the 
responses. Combining these various increments, observing 
appropriate signs (+ and -) in the algebraic summations, we 
find the following (see Corollaries v and vii): 



From equations 38 and 39 we have, 

$$F'_p = \frac{1}{1.00 - p}. \tag{40}$$

For example, if the probability of the occurrence of R+ is .75, then, 

$$F'_p = \frac{1}{1.00 - .75} = \frac{1}{.25} = 4.00.$$

Similarly, 

$$F'_q = \frac{1}{1.00 - .25} = \frac{1}{.75} = 1.333.$$

In this case the total mean alternation cycle (B') would be, 

$$B' = F'_p + F'_q = 4 + 1.333 = 5.333.$$

At this point it must be noted that the formulae so far considered 
yield the mean number of p or q events in uninterrupted sequences 
such as occur in dice throws where (1) the number of throws in- 
volved is assumed to be infinite, and (2) the values of p and q 
remain constant throughout the series. Neither of these conditions 
is found in empirical trial-and-error learning situations. Such 
series rarely exceed one or two hundred trials, and usually do not 
exceed fifty; the theoretical example worked out for Table 5 (p. 46) 
has only 18 trials, and the median N in the empirical investigation 
discussed above (Figure 7) was 39. It is evident that a sequence of 
alternative events which is short will not have as great a mean 
value of F' as a sequence which is longer, because a marked 

tion principle does not prevent the occurrence of appreciable 
sequences of one reaction to the exclusion of the other. 

Preliminary to the study of these phenomena as related to simple 
trial-and-error learning, it will be well to explain the employment 
of certain useful terms. Perhaps the most fundamental concept in 
this complex of relationships is that of response alternation. An 
alternation is said to occur when one type of response shifts to the 
other on the next occasion. For example, in the response-sequence 
fragment: 

‖ R+ R+ R+ | R_ R_ ‖ R+ ... 

the alternations are each marked by a vertical line. Our second 
concept, arising directly from the first, is that of the alternation 
phase, i.e., the number of reactions falling between two successive 
alternations. Thus in the above example the first alternation phase 
represented contains three R+'s. Our third concept is that of the 
alternation cycle. An alternation cycle consists of two successive 
alternation phases; thus we have, in the above example, an 
alternation cycle of 3 + 2 or five reactions enclosed between the two 
heavy vertical lines. Finally, there is the concept of the asymmetry 
of the response cycle: in a given behavior cycle there may be more 
reactions in one alternation phase than in the other. Thus in the 
above alternation cycle, asymmetry is indicated by the fact that the 
first phase contains three responses, whereas the second contains 
only two. 

In the theory of chance it is customary to represent the complete 
certainty of the occurrence of an event by 1.00, and any known 
probability less than certainty by a decimal. Thus the chance of 
heads on any single coin toss is .50, and that of tails is 1.00 - .50 
or .50 also. In cases of this sort where two events are involved, the 
probability of one event is usually called p, and that of the other 
event is usually called q. The mean number of p events which in the 
long run will occur without interruption in an infinite number of 
continuous trials is 
limitation in the length of the series as a whole will necessarily 
exclude from the values to be averaged some very long uninter- 
rupted sequences of both p and q events. The mean number of 
uninterrupted p events (Fp) in a limited series of N events is given 
by the provisional equation,* 

$$F_p = F'_p\left[1 - \frac{(F'_p - 1)^{N+2} - 1}{(F'_p - 2)\,{F'_p}^{N+1}}\right]. \tag{41}$$

Let it be supposed, for example, that p = .95 and N = 18. By 
equation 40, 

$$F'_p = \frac{1}{1.00 - .95} = \frac{1}{.05} = 20.$$

A mean of 20 would be quite impossible in the assumed situation, 
since there are only 18 events in the entire series, i.e., N = 18. 
Substituting these values in equation 41, we have, 

$$F_p = 20\left[1 - \frac{(20 - 1)^{18+2} - 1}{(20 - 2)\,20^{18+1}}\right] = 20\left[1 - \frac{19^{20} - 1}{18 \times 20^{19}}\right]$$

$$= 20(1 - .3982)$$

$$= 20 \times .6018$$

$$= 12.036.$$

It will be recalled that the above computation is based on the 
supposition that p does not change. This means that a theoretical 
mean alternation phase calculated in the case of a simple trial- 
and-error process really represents the mean length of such a phase 
if the reaction-evocation process should continue unchanged for a 
considerable number of trials at a single point. Actually p changes 
continuously, and rather rapidly under certain conditions, as shown 
in Tables 1 and 3. Therefore the value of Fp calculated 
from equation 41 is only an approximation to the true Fp; 
what this function is, is not now known, though it is possible to state 
certain things about it. This all means that until a more adequate 
mathematical analysis of the purely chance situation is attained, 
only rather general conclusions may be drawn concerning the 
behavior of organisms in actual simple trial-and-error learning 
situations. However, no radical error should result in our tentative 
examination; at least it will serve to open the problem to theoretical 
analysis. 

* This and other formulae employed in the present section were 
derived for use here by Alfred W. ..., August, 1943. 

A convenient index of the nature and degree of response-cycle 
asymmetry (Y) is the quotient obtained by dividing the difference 
between the number of reactions in the respective phases of the 
response cycle by the total number of responses in the cycle; to this 
quotient is affixed the sign of the phase containing the greater 
number of responses. Stated formally, this index becomes, 


$$Y = \frac{F_p - F_q}{F_p + F_q}. \tag{42}$$

Thus in the example considered above (p. 43), assuming that p 
represents probability of the correct reaction, we have, 

$$Y = \frac{4 - 1.333}{4 + 1.333} = +\frac{2.667}{5.333}, \qquad \therefore\; Y = +.50.$$

The meaning of the above concepts may be further illustrated 
by the well-studied laws of the outcome of the successive tosses 
of a single coin. The theory of chance shows that in the long run 
the average number (F) of successive heads before a reversal is two, 
and the same is true of tails. This of course yields a mean cycle, 
where the number of throws is unlimited, of 2 + 2 = 4 reactions. 
Finally, since the two phases are equal, the asymmetry (Y) will be: 

$$Y = \frac{2 - 2}{2 + 2} = \frac{.00}{4} = .00;$$

i.e., the theoretical head-tail cycles in coin tossing are perfectly 
symmetrical. 
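The asymmetry index of equation 42 is simple enough to express directly. The sketch below assumes, as in the worked example, that the p phase is the correct reaction, so that a positive Y means the correct phase is the longer one. The function name is illustrative.

```python
def asymmetry_index(f_p, f_q):
    """Equation 42: Y = (Fp - Fq) / (Fp + Fq).

    The sign of the result is that of the longer phase: positive when the
    p phase has more responses, negative when the q phase has more.
    """
    total = f_p + f_q
    if total <= 0:
        raise ValueError("a cycle must contain at least one response")
    return (f_p - f_q) / total


if __name__ == "__main__":
    print(round(asymmetry_index(4.0, 1.333), 2))  # example with p = .75
    print(asymmetry_index(2.0, 2.0))              # coin tossing
```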




A BEHAVIOR SYSTEM 


Response Alternation Characteristics of Simple Trial-and-Error Learning 

With the preliminary analysis of the phenomena of the alternation 
of purely chance events in general before us, we may now proceed 
to its use in the theory of simple trial-and-error learning. Since the 
theoretical data of Table 3, when taken together, display prac- 
tically the whole range (i.e., from beginning to end) of a typical 
case of simple trial-and-error learning, we shall take for our present 
purpose the probability values appearing in the last two columns 
of that table. These values are reproduced as the second and third 
columns of Table 5. 

TABLE 5. A table showing the progressive changes in the theoretical mean number 
of uninterrupted sequences (F) of R+ and R_ respectively in a case of distributed- 
trials simple trial-and-error learning, the changes in the length of the mean alterna- 
tion cycle (B) and in the asymmetry index (Y). 

Columns: trial number; reaction probability (p+, p_); theoretical mean number 
of responses per alternation phase, N = ∞ (F'+, F'_); theoretical mean number 
of responses per alternation phase, N = 18 (F+, F_); theoretical mean number 
of responses per alternation cycle, N = 18 (B); theoretical asymmetry of 
alternation cycle, N = 18 (Y). 

Trial   p+     p_     F'+      F'_     F+       F_       B        Y
  1    .017   .983    1.02    58.82    .266   15.612   15.878   -.966
  2    .056   .944    1.06    17.86    .680   11.507   12.187   -.888
  3    .124   .876    1.14     8.06   1.035    7.305    8.340   -.752
  4    .217   .783    1.28     4.61   1.260    4.547    5.807   -.566
  5    .328   .672    1.49     3.05   1.486    3.046    4.532   -.344
  6    .451   .549    1.82     2.22   1.821    2.217    4.038   -.098
  7    .578   .422    2.37     1.73   2.370    1.730    4.100    .156
  8    .697   .303    3.30     1.43   3.294    1.432    4.726    .394
  9    .796   .204    4.90     1.26   4.816    1.234    6.050    .592
 10    .868   .132    7.58     1.15   6.969    1.060    8.029    .736
 11    .915   .085   11.76     1.09   9.367     .870   10.237    .830
 12    .945   .055   18.18     1.06  11.593     .673   12.266    .890
 13    .963   .037   27.03     1.04  13.292     .507   13.799    .926
 14    .974   .026   38.46     1.03  14.504     .391   14.895    .947
 15    .981   .019   52.63     1.02  15.353     .292   15.645    .963
 16    .985   .015   66.67     1.02  15.860     .238   16.098    .970
 17    .988   .012   83.33     1.01  16.258     .195   16.453    .976
 18    .990   .010  100.00     1.01  16.520     .166   16.686    .980

These values are substituted in equations 39 and 40 to secure the 
theoretical mean number of the R+ responses and the R_ 
responses in the respective alternation phases 
at the different stages of the learning process on the simple chance 
assumptions based on an unlimited series of trials. These are shown 
in the fourth and fifth columns of Table 5. The values are then 
converted by means of equation 41 to chance values based on the 
assumption of only 18 trials; these are shown in the sixth and 
seventh columns of Table 5. They are represented graphically in 
Figure 10. There it may be seen that the value of F+, beginning at 
.31, rises very slowly at first, then with great rapidity, after which 
the rate of rise considerably lessens. On the other hand, F_ begins 
with a large value and falls rapidly at first, after which its rate of 
fall becomes nearly linear. 

FIGURE 10. Graphs showing the theoretical mean number of responses (F) per alter- 
nation phase as a function of the number of trials in distributed-trials simple trial-and- 
error learning for p+ and p_ respectively. Plotted from columns 6 and 7 of Table 5. 

The values of the alternation cycles (the sum of the values in 
columns 6 and 7) as a function of the number of reinforcements are 
shown in the next-to-last column of Table 5. They are represented 
graphically in Figure 11. A glance at this figure shows that at the 
beginning and end of a complete simple trial-and-error process 
where R_ is decidedly dominant at the outset, the alternation 
cycles are relatively protracted, the minimum being reached at a 
point somewhat anterior to the middle of the reinforcements where 
the cycle reaches a value of approximately four responses, as do 
coin tosses. 

FIGURE 11. Graph showing the theoretical mean response cycle as a function of the 
number of reinforcements. Plotted from the next-to-last column of Table 5. 

The asymmetry of the alternation cycle has been calculated by 
means of equation 42 from values appearing in columns 6 and 7 
of Table 5; the Y values are presented in the last column. They are 
represented graphically in Figure 12. There it may be seen at a 
glance that the theoretical asymmetry of a complete simple trial- 
and-error process begins in the negative phase and rises with a 
positive acceleration to a zero value, after which it passes into a 
positive phase through which it rises with a negative acceleration, 
the whole presenting a characteristic sigmoid picture. 
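The last four columns of Table 5 can be regenerated from the p+ column with a short sketch. It uses equation 41 rewritten in terms of p and q = 1 - p, a rearrangement made here for numerical convenience (substituting F' = 1/q into equation 41 gives the same quantity); the rearrangement, the function names, and the handling of p = q are assumptions of this sketch, not part of the text.

```python
# p+ column of Table 5: probability of the correct reaction at each trial.
P_PLUS = [.017, .056, .124, .217, .328, .451, .578, .697, .796,
          .868, .915, .945, .963, .974, .981, .985, .988, .990]
N_TRIALS = 18


def limited_phase(p, n):
    """Mean uninterrupted run of an event of probability p in n trials.

    Equation 41 with F' = 1/q substituted, which reduces to
    (1/q) * (1 - (p**(n+2) - q**(n+2)) / (p - q)).
    """
    q = 1.0 - p
    if abs(p - q) < 1e-12:
        shrink = (n + 2) * p ** (n + 1)  # limiting value when p = q
    else:
        shrink = (p ** (n + 2) - q ** (n + 2)) / (p - q)
    return (1.0 / q) * (1.0 - shrink)


def table_row(p_plus, n=N_TRIALS):
    """Return F+, F_, B and Y for one trial of Table 5."""
    f_plus = limited_phase(p_plus, n)
    f_minus = limited_phase(1.0 - p_plus, n)
    cycle = f_plus + f_minus                # B, the alternation cycle
    asymmetry = (f_plus - f_minus) / cycle  # Y, equation 42
    return {"F+": f_plus, "F_": f_minus, "B": cycle, "Y": asymmetry}


rows = [table_row(p) for p in P_PLUS]

if __name__ == "__main__":
    for trial, row in enumerate(rows, start=1):
        print(trial, {key: round(value, 3) for key, value in row.items()})
```

The recomputed values agree with the printed table to within a few units in the second or third decimal place, which is about what hand computation from rounded intermediate figures would give. The shortest cycle falls at trial 6, where the two probabilities are nearly equal.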

Generalizing from the preceding considerations, we arrive at the 
following theorem: 

THEOREM 10. In simple trial-and-error learning where R+ is fairly 
strong but several σ's weaker than R_: 

A. The mean R+ alternation phase (Fp) will be minimal at the 
outset of practice but will gradually increase with a negative accelera- 
tion to a maximal value as practice is indefinitely continued. 

B. The mean R_ alternation phase (Fq) begins with a maximal value 
and falls at a positively and then a negatively accelerated rate to a 
minimal value. 

C. The alternation cycle (B) begins with a relatively large value, 
falls at first with a positive and then with a negative acceleration to a 
minimal value near 4, after which it rises again, at first with a positive 
and then with a negative acceleration, to a relatively large value. 

D. The index of asymmetry (Y) in Case IV of the complete simple 
trial-and-error learning process begins with a value approaching -1.00 
according to the magnitude of the difference between R+ and R_, and 
then rises through zero to +1.00, following a sigmoid course. 

Comparison of Theoretical with Empirical Phenomena 
of Response Alternation 

We may now proceed to the consideration of the empirical validity 
of certain of the theorems derived above regarding the response 
alternation aspect of simple trial-and-error learning. The empirical 
evidence, for the most part, comes from a single experiment on 
continuous-trials learning of this type (5). In this experiment 
albino rats were given one day's training in operating the vertical 
bar shown in Figure 13, the horizontal bar being retracted. The 
animals were allowed to make only 15 reactions, each reaction 
being reinforced by a small cylinder of hard but appetizing food. 
Twenty-four hours later the vertical bar was retracted and the 
horizontal bar was introduced. Sixty manipulations were evoked 
on the horizontal bar, each being followed by the same type of 


reinforcement. On the third day the already stronger habit was given 
further reinforcements, and then the animals were presented with both 
bars as shown in Figure 13, only the operation of the vertical bar, 
which had received the weaker training, being reinforced. This is 
substantially the situation described above as Case IV. A number of 
animals were employed in one group or another in this experiment. 

FIGURE 13. The apparatus employed in the experiment concerned with 
simple trial-and-error learning by continuous trials. When the vertical 
bar was pushed to the left during the trial-and-error process, food was 
delivered into the food-dish, K, but when the horizontal bar was pressed 
downward, no food was delivered (9, p. 237). 

From this procedure the following empirical facts were reported: 

Of 76 animals which mastered the trial-and-error learning (P), 
all displayed one or more response-alternation cycles, the median 
animal responding in three and the extreme animal in 15 alterna- 
tion cycles. This is in general accord with Theorem 10 C. 

Of these 76 animals (P), 25 gave a total of four, five, or six 
complete response-alternation cycles. The scores of each of the 
first and last four alternation phases of these 25 records were 
averaged to secure a somewhat exaggerated effect. The results of 
this operation are shown graphically in Figure 14. There may be 
seen a fair approximation to the theoretical Figure 10, in that as 
practice continues (1) R_ shows a progressive and generally 
negatively accelerated fall, and (2) R+ shows a progressive and 
generally positively accelerated rise. This is in substantial 
agreement with Theorem 10 A and B, though there may be seen some 
discrepancy at the upper level of each curve. It is believed that 
this discrepancy is due to the fact that the trial-and-error process 
did not begin at an early enough stage and also was not carried far 
enough to show its full effect. 

FIGURE 14. Graph showing the mean number of reactions in each alternation 
phase of four alternation cycles made by 25 animals, each of which gave four or 
more response-alternation cycles during a single simple trial-and-error learning 
process by continuous trials. Plotted from data from Hull (9, p. 252). 
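The scoring of an individual record into alternation phases and cycles, as defined earlier in the chapter, can be sketched as follows. The record shown is hypothetical and is not data from the experiment; the function names are likewise illustrative.

```python
from itertools import groupby


def alternation_phases(record):
    """Split a response record into alternation phases.

    Each phase is an uninterrupted run of one response type, returned as
    (response, run_length). An alternation occurs between adjacent phases.
    """
    return [(kind, sum(1 for _ in run)) for kind, run in groupby(record)]


def alternation_cycles(phases):
    """Pair successive phases into complete alternation cycles.

    A trailing unpaired phase is not counted as a cycle.
    """
    return [(phases[i], phases[i + 1]) for i in range(0, len(phases) - 1, 2)]


def cycle_asymmetry(cycle, correct="+"):
    """Equation 42 applied to one cycle; positive when the correct
    response has the longer phase."""
    n_correct = sum(n for kind, n in cycle if kind == correct)
    n_error = sum(n for kind, n in cycle if kind != correct)
    return (n_correct - n_error) / (n_correct + n_error)


if __name__ == "__main__":
    # Hypothetical record: "-" marks R_, "+" marks R+.
    record = "---++--+++-++++"
    phases = alternation_phases(record)
    for cycle in alternation_cycles(phases):
        print(cycle, round(cycle_asymmetry(cycle), 2))
```

Averaging the phase lengths position by position over many such records would give curves of the kind plotted in Figure 14.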

The data represented in Figure 14 have been combined into 
alternation cycles and plotted as Figure 15. This shows a clear 
tendency for the alternation cycles to be high at the beginning and 
the end of the simple trial-and-error learning process, though due 
to the fact that the trial-and-error learning was not carried to an 
advanced stage the data for these figures do not give much oppor- 
tunity for the long cycles at the posterior end of the process to 
manifest themselves. Thus the theory as presented in Theorem 10 C 
and Figure 11 is confirmed in the main, though not in complete 
detail. 

Next we consider the asymmetry of the four mean empirical 
alternation cycles which appear in Figure 15. These asymmetry 
values are shown graphically in Figure 16. This graph is 
relevant to Theorem 10 D . . . 
though this may have been due in part at least to the fact that 
the trial-and-error process was not carried far enough in the 
experiment. 

FIGURE 16. Empirical graph showing the rise in index of asymmetry of mean alter- 
nation cycles. Calculated from the data represented in Figure 14. 

Summary 

This chapter presents two deductions of very general application. 
One deduction is to the effect that increments of both sEr and sIr 
generalize to situations having stimuli which are relatively similar 
and responses which usually involve substantially the same muscle combina- 
tions. Recently, empirical evidence to this effect has become known. 
The other theorem states the deduction that massed trials in simple 
trial-and-error and similar learning are less effective than are dis- 
tributed trials, a fact long known empirically. 

Simple trial-and-error learning itself takes place in normal 
mammalian organisms when they are presented with a stimulus 
situation that either through the organism's inheritance or previous 
learning, or both, tends to evoke two or more distinguishable reac- 
tions, of which only one receives reinforcement. In case the com- 
peting reaction potentials are two, and both are weak but equal in 
strength (Case I) there will be first a more or less irregular alterna- 
tion between them, the erroneous one gradually becoming weak- 
ened by experimental extinction and the successful one being 
gradually strengthened by reinforcement. The increase in strength 
of the correct reaction potential is generally sigmoid in nature, 
beginning with a positive and ending with a negative acceleration. 
The erroneous reaction potential falls with a negative acceleration 
followed by a mild rise due to the generalization of the reinforce- 
ments which are occurring. These processes jointly generate a 
rather steep, negatively accelerated probability-of-reaction-evoca- 
tion curve of learning which begins with each R at about 50 per 
cent and reaches perfection (100 per cent) without the correct 
reaction potential rising anywhere near its physiological limit and 
without the erroneous reaction potential at the end standing very 
much below its original level. 

When simple trial-and-error learning begins with two equal reac- 
tion potentials which are both relatively strong (Case II), the same 
general situation as that just described results, though there is a 
reduction in the amount of growth of the correct tendency, an 
increase in the amount of fall in the erroneous tendency, and more 
trials are necessary for the correct tendency to attain complete 
dominance. 


Perhaps the classical form of simple trial-and-error learning is 
found where the erroneous reaction potential is strongly dominant 
at the outset of the process and where the correct reaction potential 
is much weaker but well above the reaction threshold (Case IV). 
The course of the correct reaction potential is at first slightly down- 
ward as the result of generalized extinction effects; at length, 
however, from the increase in the proportion of the reinforced 
trials it begins to rise with a positive acceleration. This rise near the 
end of the process tends to pass over into a negative acceleration. 
As in Cases I and II, the fall of R_ is negatively accelerated at the 
beginning and this fall is about the same in extent as the rise of the 
correct reaction potential. Late in the process the fall gives place 
to a slight rise due to generalized ΔsEr's. The probability of correct 
reaction evocation which results from the competition of the two 
processes in this case shows a very clearly marked sigmoid curve of 
learning. It is thus evident that the probability “curve of learning,” 
even in the same type of learning process, is not constant but 
that the form is dependent upon the conditions under which 
the learning occurs, in this case the sEr points at which learning 
begins. 

In Case V, which is the same as Case IV except that R+ is only 
a little above the reaction threshold, the generalization of extinction 
effects from R_ may depress the correct reaction potential below 
the reaction threshold before the two potentials get close enough 
together for the oscillation tendency to bring about any evocation 
of R+. In such an event, of course, R+ ordinarily will never be 
evoked and the organism will fail in adapting to the situation. 

The theory of probability shows that dominance alternation in 
the case of two competing reaction potentials in a simple trial-and- 
error situation contains longer or shorter runs of one reaction 
potential to the exclusion of the other. Analysis reveals that at the 
outset of Case IV, our most general form of simple trial-and-error 
learning, (1) the negative phase of each alternation cycle is, on the 
average, considerably longer than the positive phase; (2) as the 
learning progresses the two become about equal; and (3) as the 
learning approaches perfection the positive phase becomes much 
the longer. In a parallel manner, at the outset of such learning the 
mean alternation cycle as a whole is relatively long; it falls to a 
minimum near the middle of the process, after which it rises to 
indefinite heights with continued practice. 

A comparison of the theorems derived from the present system 
with the empirical evidence now available reveals an extensive 
amount of rough approximation which tends to support at least 
the main features of the postulates employed. However, there is a 
notable lack of empirical evidence regarding the separate learning 
curves of R+ and R_, though this could easily be supplied (XIV) 
by the latencies of these responses if accurately taken under the 
distributed-trials technique. Careful quantitative work in this field 
should lead to a precise determination of the “constants” involved 
in the equations as dependent on various conditions, and may lead 
to important revisions in the theory itself. 

Terminal Notes 
ADDITIONAL FORMS OF TRIAL-AND-ERROR LEARNING 

In addition to the forms of trial-and-error learning discussed above, 
there may be mentioned three others. The first of these occurs in the 
situation where the amount of reinforcement of the negative reac- 
tion is not negative but is positive, though less than that yielded 
by the positive reaction. An incidental treatment of this type of 
learning has been given elsewhere (77, p. 146) in connection with a 
consideration of the effect of differential delays in reinforcement. A 
second form of simple trial-and-error learning which has a rather 
similar mechanism and outcome is observed in case a differential 
amount or quality of the reinforcing agent is yielded by each of two 
competing reactions. 

A third form of simple trial-and-error learning occurs where 
each of two competing reaction potentials receives exactly the same 
reinforcement, but the work (W) involved (77, p. 294), or the 
punishment received, incidental to the performance of one of the 
reactions is less than that incidental to the performance of the other. 
In this case there develops a greater amount of inhibition in con- 
nection with the reaction involving the greater amount of work, 
which neutralizes at least a portion of the potential leading to this 
reaction and thus leaves a differential advantage in effective 
reaction potential in favor of the reaction involving the less work.


HISTORICAL NOTE CONCERNING THE CAUSE OF 
ALTERNATION CYCLES 

The discussion of the alternation cycle seems to have appeared in
... wave ... time being conceived as a fairly regular
... process as the ... gradually crosses a static reaction
threshold with increased trials (...). ... implied an alternation cycle
... have seen that the cycle ...
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investigated in their relation to the influence of caffeine citrate. 
No new phenomena specifically concerned with the length or the 
asymmetry of the alternation cycle were reported, though it was 
found that the cyclic phenomena extended over a wider range in the 
central part of the series than at the ends. 

In 1939 an empirical study of animal trial-and-error learning (P) 
showed for the first time the progressive fall in the negative (extinc- 
tion) alternation phase of the alternation cycle and the progressive 
rise in the positive (reinforcement) phase of the cycle, and, by 
implication, the tendency to a minimal value of the cycle length 
in the region of zero asymmetry. 

The second theoretical attempt in this connection was published 
in 1930 (6). This explanation of the phenomenon of behavioral 
alternation in trial-and-error learning was derived solely from the 
extinction of R_ to a point such that R+ could be evoked. The
recurrence of R_ later was attributed to the spontaneous recovery
of R_ from this extinction. In 1936 substantially the same hypothe-
sis was presented, though in a somewhat more formal manner (8,
p. 20, Theorem IV).

In 1940 a still different hypothesis was put forward as to the 
nature of behavioral alternation in connection with a general 
theoretical consideration of rote learning (73). By this hypothesis, 
behavioral oscillation was postulated to be a function of varying 
resistance to reaction evocation at the reaction threshold. 

In connection with a detailed behavioral systematization (77, 
pp. 304 ff.), there was presented in 1943 a modification of the 1940 
hypothesis as to the nature of behavioral oscillation; namely, that 
it is a function of habit strength, that all habit strengths oscillate 
independently, and that the reaction threshold is static. No specific 
application was made at that time to alternation cycles as such. 
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3. Discrimination Learning 


At the outset of our consideration of the subject of discrimination
learning it will be well to clarify our use of certain terms. It is
especially important to distinguish simple discrimination learning
from simple trial-and-error learning,² which was discussed in some
detail in the preceding chapter.

The distinction can be made perhaps most effectively on the basis
of the stimulus-response relationships involved. Let it be supposed,
for example, that each of two stimuli, S1 and S2, has the capacity to
evoke a particular reaction, R1, as shown in Figure 17. Because S1
and S2 are dynamically equivalent in so far as the evocation of R1
is concerned, this relationship may be said to be that of stimulus
equivalence. Let it be supposed further that when R1 is evoked by S1
(S2 being absent) the situation will be such that reinforcement will
follow, but that when R1 is

¹ A portion of Chapter 3 appeared, nearly verbatim, in the Psychological Review, 1950,
57, 303-313.

² This distinction has been emphasized by Spence (17, pp. 429-430). The present
chapter is essentially the writer’s interpretation of Spence’s extension and formalization
(17; 18; 19) of Pavlov’s analysis (15, pp. 117 ff.) of discrimination learning.
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FIGURE 17. Diagram showing the type of 
stimulus-response situation which precipitates
simple discrimination learning. Because
the reaction potentialities at the outset
converge from S1 and S2 upon R1, this
is called the convergent S → R situation.
For a contrasting diagram of the divergent
S → R situation which is basic to simple
trial-and-error learning, recall Figure 1,
p. 19.
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resulted as usual, but when Ri followed the presentation of the 
white card, no food was ever given. This differential reinforcement, as
it is called, gradually caused a differentiation in the response
intensities to the two stimuli, as shown in Figure 19, beginning at
day 1 and continuing up through day 48. This represents the



FIGURE 18. A drawing of the apparatus utilized in the study of simple white-black
discrimination. The albino rat is placed in the chamber beneath the transparent lid
marked L', which is shown as closed. When the animal is facing the shutter (S) ready
to go into the next chamber, the experimenter lifts S somewhat more than enough for
the rat easily to pass through into the chamber beneath the lid, L, shown as open, the
shutter being suspended in this position by the hooked rod, H. Just as the shutter rises
high enough for the animal to pass through, the shoulder C depresses the spring contact
C', starting an electric laboratory clock recording time in hundredths of a second. Next,
the animal pushes beneath the sloping cardboard door (D) to get the food, F. When the
door is raised one inch the microswitch (M) stops the clock, which then shows the response
time of the subject. The white or black stimuli to be discriminated were placed
on the side of the door faced by the rat when in the chamber beneath lid L. Reproduced
from Wilcoxon, Hays, and Hull (22).

peculiarly discriminatory learning. It should be observed that on
and around day 45, S1 (black) evoked a reaction potential of
approximately 4.5σ. In this connection we note that the reaction
potential evoked by S2 first increased with S1, then fell very gradu-
ally as differential reinforcement progressed.

But how is this differentiation between black and white dis- 
tributed among the five shades — black, dark gray, middle gray, 
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evoked by S2 (Si being absent) reinforcement will in no case 
follow. Under these conditions S1 → R1 will be progressively
strengthened by reinforcement, and S2 → R1 will be progressively
weakened by experimental extinction, until at length S1 will uni-
formly evoke R1 and S2 never will; this latter constitutes the state
of perfect simple discrimination.

In summarizing the contrast of the two types of learning just 
considered we may say that they are alike in that they involve the
selective strengthening of one (the adaptive) receptor-effector 
connection rather than some other (the unadaptive) receptor- 
effector connection. They are distinguished by the fact that in 
simple discrimination learning the receptor-effector connection 
selected differs on the stimulus side from the one eliminated, whereas 
in simple trial-and-error learning the receptor-effector connection 
selected differs on the response side from that which is eliminated. 
By extending the meaning of the word selection a little, we may say 
that simple discrimination learning involves primarily stimulus selection,
whereas simple trial-and-error learning involves primarily response selection. 

A Concrete Example of Simple Separate-Discriminanda
Presentation Discrimination Learning


In order that the reader may secure an appreciation of the phe- 
nomenon the theory of which we are about to consider, we now 
present a simple concrete case of such learning. This consisted, 
... of a locomotor and door-lifting response
... (S1) which constituted the
... albino rat ... chamber beneath lid L. The response to
... reinforcement. A curve of the
... approximately up to its
... appears in Figure 19. The sEr
values on the graph were ... of the
median response latency of ... animals and then the
calculation of the equivalent ... substitution
of the sTr’s in equation 28.

When the animals had learned ... the black card
they were presented with an ...
black (S1) and a white card (S2) ...
(R1) followed the presentation of the black card, ...
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light gray, and "white? In order to determine this experimentally 
Antoinetti tested a group of eight rats, which were similar but 
had been trained somewhat differently, for their reaction potentials 
on the five shades of the series. To do this the differential-rein- 
forcement procedure was continued in the main except that among 
each subject’s sixteen daily response-evocation trials there were 
given on each of six consecutive days two “test” trials on dark 
gray, middle gray, and light gray. This technique yielded from each 



FIGURE 20. A rough average post-discrimination generalization gradient secured by
Antoinetti on the black-white continuum after differential reinforcement, in terms of a 
human j.n.d. scale of brightness. The animals differed considerably in their learned 
powers of discrimination. Reproduced by permission of John A. Antoinetti (7). 

subject two scores on each gray; in addition, of course, several 
additional scores on black and white were obtained on these days. 
The order of presentation of the six critical test trials was varied 
with the different subjects in such a way that no shade would, on 
the average, be presented earlier or later than the other two. Since 
none of the special test stimuli was reinforced, it is assumed that 
all three of them would be slightly depressed by the incidental 
non-reinforcement. 

We note also that the Munsell coated papers which Antoinetti 
used as stimuli are very carefully graded psycho-physically so as
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We shall call these non-continuum stimulus elements the incidental 
stimuli. They will be represented as a whole by S3. The learned 
attachment of S3 to R1 obviously sets up a separate reaction poten-
tial which operates wherever S3 occurs, and S3 always accompanies



FIGURE 21. The lower portion of this figure shows a graphic representation of a
theoretical generalization gradient on a qualitative stimulus continuum plotted in
terms of j.n.d. differences. Above is shown the same gradient after it has been combined
(+) with a reaction potential of 3.5σ (dotted horizontal line) assumed to be caused by
incidental stimuli (S3). Note that this summation greatly distorts the generalization
gradient (a) by moving it upward leaving its asymptote at the level of 3.5σ, and (b) by
greatly reducing its slope. The value of M is represented by the upper horizontal line at
6.0σ, and that of the reaction threshold is represented by the broken line at .355σ.

the generalization continuum as used in the experiment. This
means that the S3 stimuli by themselves will evoke R1 whenever
S1 or S2 are presented, quite apart from the latter. Thus, so far as
S3 is concerned, this combination of circumstances gives a super-
ficial appearance of 100 per cent generalization.
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to have equal differences in terms of “just noticeable differences” 
(j.n.d.’s) as seen by the human eye. This may be called the “quali-
tative” or subjective scale of brightness which is the traditional
approach. In addition we shall later (p. 78) consider the quanti-
tative or objective scale of brightness in terms of per cent physical
reflectance of the papers. The resulting post-discrimination gen-
eralization gradient of this psychophysical scale is shown in Figure 20.


We shall now proceed to derive a theoretical account of the 
empirical phenomena just presented as an illustration of the 
qualitative approach. 


The Special Role of Incidental Stimuli (S3) in Discrimination Learning

The most important type of discrimination learning is based on 
stimulus generalization (70; 6; 5). Present available evidence indi- 
cates that the gradient of stimulus generalization takes the general 
form graphically represented in the lower portion of Figure 21. 
The phenomenon of stimulus generalization as employed by Pavlov 
and many other writers (75; 77; 18) is based on a continuous series 
of potential stimuli, and for this reason this series potentiality is 
called a stimulus continuum. Such a continuum could be the series 
of sound (pitch) vibrations varying from high to low; of colors 
ranging from deep red to orange; of light intensities extending from 
a strong illumination to one near zero in amount; or of cutaneous 
vibrations at a constant rate varying from a strong intensity to a 
weak one. The continuum utilized by Antoinetti in the example 
given above included Munsell coated papers ranging from a white 
having 78.66 per cent of the reflectivity of magnesium oxide to a 
black having 1.21 per cent. 

During learning days I to VII in Antoinetti’s experimental
situation, it is obvious that in addition to the blackness of S1 (i.e.,
of the generalization continuum), many additional or incidental
stimuli become conditioned to R1. These include other stimuli
which consistently impinge on the organism’s sensorium during
the repeated reinforcements; for example, the sound of the click
when the sheet-metal shutter (S) is lifted, the lights and shadows
of the chamber which the animal passes through from shutter to
... may have been consistently present, and
the infinitely complex stimuli coming from the animal’s own body.
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this may explain why Pavlov’s (75, p. 118), Hovland’s, and Brown’s 
(2) generalization gradients were approximately horizontal on the
first unreinforced trial utilized in the tests for generalization (see
Figure 22, first trial). As a matter of fact, practically the same thing
is shown by Antoinetti’s graph (Figure 19) at days 1 to 19 by the
near equality of the S1 and S2 values, even though these are each
based on eight differential reinforcements each day.³

Next there arises a question of how, in case the incidental sEr
nearly equals the value of M, the generalization gradient can ever
become manifest. The answer is that its appearance results from
the gradual removal of the incidental sEr by experimental extinc-
tion. Frequently, however, S3 → R1 cannot be extinguished without
the non-reinforced presentation of the stimulus continuum in some
form. For example, in the Antoinetti experiment a door is required,
and any door must be of some shade. We shall accordingly neglect
for the present the sEr connected by stimulus generalization to S2,
and concentrate on the sEr based on S3.

At the outset of differential reinforcement, the incidental reaction
potential of S3 → R1 presumably is well along toward its asymptote
(Figure 19) from the training of days I to VII. This implies that
the extinction portion of the differential reinforcement training
will start from near zero and presumably (22) will be advancing
relatively rapidly. It follows that during differential reinforcement
the s3Ir will tend gradually to overtake the s3Er and the super-
threshold portion of the s3Er will gradually approach zero, i.e.,
there will be a tendency toward,⁴

$$ {}_{S_3}E_{R} \dot{-} {}_{S_3}I_{R} = 0. \tag{43} $$

This means that the effective reaction potential under the control 
of the incidental stimuli will for most purposes gradually become 
relatively neutral and unimportant in the determining of overt 

³ The relative slowness of Antoinetti’s S2’s in developing inhibitory potential from
the differential reinforcement employed appears, from a comparison of his results with
data reported by Raben (76), to be due in part to the number of non-reinforcements
given with each reinforcement. It is customary to give a half dozen or more non-
reinforcements to S2 for a single reinforcement to S1; and in general the more non-
reinforcements to S2, relatively the more prompt will be the differentiation. This fact,
in turn, may be due in part to the short time-intervals between trials (see terminal note,
Chapter 4) and, in addition, to the fact that S1 and S2 became secondary reinforcing
agents during the original seven days of training when all stimuli were reinforced.

* bEr " bEr bLr. 
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Now since both sets of stimuli. Si and Ss, involve the same 
response their reaction potentials must combine (+) with each
other (77, pp. 222 ff.). Let us assume (IV) that the irrelevant (S3)
habit strength (sHr) taken by itself amounts to .58333, whereas
the habit strength of S1 → R1 taken by itself amounts to .50.
Assuming that M equals 6.0σ and (VIII) multiplying, we find
that the reaction potential controlled by S1 directly is .50 × 6.0,
or 3.0σ, and that controlled by S3 is .5833 × 6.0, or 3.5σ. Combin-
ing this constant incidental reaction potential of 3.5σ with each
point of the generalization gradient by means of summation equa-
tion 11, we secure the displaced and distorted gradient shown in
the upper portion of Figure 21.
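
As a rough numerical check of this step, the sketch below multiplies the two habit strengths by M and then combines the resulting reaction potentials. The form used for summation equation 11, a + b − ab/M, is an assumption on our part, inferred from the worked arithmetic elsewhere in this chapter (1.538 summated with 1.538 under M = 6 giving 2.682); the function name `summate` is likewise only illustrative.

```python
def summate(e1, e2, m=6.0):
    """Summation (+) of two reaction potentials under the ceiling m.

    Assumed form of equation 11: e1 + e2 - (e1 * e2) / m.
    """
    return e1 + e2 - (e1 * e2) / m


M = 6.0                 # maximum reaction potential, in sigma units
e_s1 = 0.50 * M         # S1 -> R1: habit strength .50 times M
e_s3 = 0.58333 * M      # S3 -> R1: habit strength .58333 times M
combined = summate(e_s1, e_s3, M)

print(f"S1 alone: {e_s1:.2f}  S3 alone: {e_s3:.2f}  combined: {combined:.2f}")
```

On this assumption the combined potential at the point conditioned comes to about 4.75σ, appreciably less than the arithmetic sum of 6.5σ, and it can never exceed M.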

Let us compare these two manifestations of the same gradient.
The addition of the irrelevant sEr of 3.5σ artificially raises the
asymptote of the generalization gradient by that amount. At the
same time, summation greatly flattens the gradient, though the
fractional amount of fall toward the asymptote of each gradient
at each increment of d is constant and exactly equal in both, in the
present case approximately 56.23 per cent of the preceding point.
However, if the value of M were reduced from 6.0σ to 5.0σ, the
upper gradient would appear much less steep still, i.e., it would be
even more distorted, and if M were reduced to 3.6σ the gradient
would be so flat as to be practically horizontal; empirically it
would hardly be detectable. At bottom this is because near the
asymptote in ordinary learning (4) additional practice adds little
to the strength of the habit.
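
The flattening described above can be sketched numerically. The example assumes a direct gradient of 3.0 × 10^(−.0125d), an incidental potential of 3.5σ, and summation of the form a + b − ab/M; the exponent is borrowed from the generalization equation used later in the chapter, and the summation form is inferred from its worked arithmetic, so both are assumptions rather than values stated for Figure 21. All names are illustrative.

```python
def summate(e1, e2, m):
    """Assumed form of summation equation 11."""
    return e1 + e2 - (e1 * e2) / m


def direct_gradient(d, peak=3.0, rate=0.0125):
    """Assumed generalization gradient of S1 -> R1 at d j.n.d. from S1."""
    return peak * 10 ** (-rate * d)


def combined_gradient(d, m, incidental=3.5):
    """Gradient after summation with the constant incidental potential."""
    return summate(direct_gradient(d), incidental, m)


def spread(m, incidental=3.5):
    """Height of the combined gradient above its asymptote at d = 0."""
    return combined_gradient(0, m, incidental) - incidental


def step_ratio(m, incidental=3.5, step=20):
    """Fraction of the above-asymptote height remaining after one step of d."""
    return (combined_gradient(step, m, incidental) - incidental) / spread(m, incidental)


for m in (6.0, 5.0, 3.6):
    print(f"M={m:.1f}  spread={spread(m):.3f}  ratio per 20 j.n.d.={step_ratio(m):.4f}")
```

Under these assumptions the ratio per 20 j.n.d. stays at about .5623 whatever the value of M, while the visible spread above the asymptote shrinks from 1.25σ at M = 6.0 to less than .1σ at M = 3.6.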

From the preceding considerations, we arrive at our eleventh 
theorem: 


THEOREM 11 A. The stimulus generalization gradient as produced
on a stimulus continuum by simple learning is summated (+) with
... artificially raises the apparent asymptote
of the ... gradient. As the incidental
...

... finds substantial cor-
roboration in the gradients reported by Hovland (8; 9). Incidentally
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While not duplicating the conditions of the theorem, two inde- 
pendent studies, one by Hovland (5) and one by Brown (2), show
a clear increase in the steepness of the generalization gradient as
the extinction incidental to testing for the presence of the gradient
progresses. The Brown results are reproduced as Figure 22. Note
the progressive increase in the slope of the gradients as the ordinal
number of trials increases. Incidentally the level rises at the point
conditioned, showing that sEr is being withdrawn all along the
generalization gradient. To be wholly convincing, of course, such
an experiment must remove only the S3 → R1, whereas Brown’s
and Hovland’s experiments, since the testing trials were not rein-
forced, presumably removed a portion of S1 → R1 as well.

Differential Reinforcement Applied to S1 (+) and S2 (−) on the Stimulus
Continuum Only

With the theoretical elimination of the influence of the incidental 
reaction potential (S3) by having it reduced in strength to a prac- 
tical threshold value, we may now consider the effects of differential 
reinforcement on the stimulus continuum without this complication. 
This is in a sense the heart of discrimination theory. This analysis 
will be made by the use of the qualitative or subjective (j.n.d.) scale tradi-
tional in the treatment of stimulus-intensity discrimination.

It will be recalled that the stimulus continuum of our example
was the range from black (S1) through the series of intervening grays
down to white (S2), the response (R1) being originally connected
to black by simple reinforced association. Now by the principle of
stimulus generalization, the reaction potential of S1 → R1 would
extend in diminishing amounts toward white. Unfortunately we
do not know to what low values this gradient spontaneously falls,
i.e., without the influence of the extinction effects of S2 → R1.
Indeed, strictly speaking we do not know whether it spontaneously 
falls at all. This is because, up to the present time, when an empir- 
ical test for the gradient is made it is done with the responses which 
are evoked to the various test stimuli on the continuum always 
non-reinforced, which naturally introduces a certain amount of 
extinction all along the continuum tested for generalization. This 
sometimes has been considered (72, p. 125) to indicate that the 
parts of the continuum remote from the point positively reinforced 
may be more susceptible to the influence of experimental extinction 
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action. However, it may be added that many stimuli which arc 
present nearly all the time are believed to be neutralized early 
in life for most responses by the process just described, and there- 
fore do not undergo the further neutralization. 

In terms of our equations as applied to Figure 21, we simply
reduce the values of the summated (+) but distorted gradient
shown above by 3.5σ less the value of sLr. This amounts to a
reversal of the summational procedure which produced the gradient

FIGURE 22. Graphic representation of the gradual appearance of the stimulus general-
ization gradient ... reinforcement. Note that the gradient
... instead of falling
... Reproduced by permission of J. S. Brown ...

... symbolical reversal (withdrawal, or ∸) is
performed ... substantially
reveal ... gradient in its proper position as represented at the
bottom of ... considerations, we arrive at our twelfth
theorem:

THEOREM 12. ... to the practical neutralization of ...
... substantially its true form and
position at the end of the ...
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On the Other hand, the hypothesis of dificrcntial resistance to 
experimental extinction is purely ad hoc. 

The upper gradient in Figure 23 is accordingly constructed on
this hypothesis, S1 necessarily falling on a d value of zero. The values
of this gradient are calculated by means of the equation,

$$ {}_{S}E_{R} = 4.0 \times 10^{-.0125d}. $$

In the differential reinforcement process in this case the S2 is
assumed to fall at d = 30, which reduces the reaction potential 
at that point to the reaction threshold, taken as .355 (4). Now this 
reduction in reaction potential is due in the main to the accumula- 
tion of conditioned inhibition. Moreover, conditioned inhibition 
generalizes on the stimulus continuum substantially as does reaction 
potential (X G). The slope of neither gradient is known with 
precision, but they evidently do not differ very much. We shall 
accordingly here assume them to be equal. 

But in order to calculate the generalization gradient of sIr, we
must know the amount that is to be withdrawn from the sEr at
d = 30. Now this sEr value is 1.6868σ, and this withdrawal must
be of the variety which requires the use of equation 13. In this
equation, therefore,

$$ C = 1.6868\sigma, $$

$$ {}_{S}E_{R} = {}_{S}L_{R} = .355\sigma, $$

$$ M = 6.0\sigma, $$

and the amount withdrawn is sIr.

Substituting in equation 13, we have,

$$ {}_{S}I_{R} = \frac{M(C - {}_{S}E_{R})}{M - {}_{S}E_{R}} = \frac{6.0(1.6868 - .355)}{6.0 - .355} = \frac{6.0 \times 1.3318}{5.645} $$

$$ {}_{S}I_{R} = 1.4156. $$

Accordingly the generalization gradient of conditioned inhibition
is calculated on the equation,

$$ {}_{S}I_{R} = 1.4156 \times 10^{-.0125d}. $$
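
The arithmetic of this withdrawal can be retraced in a few lines. The sketch assumes the generalization gradient 4.0 × 10^(−.0125d) and takes equation 13 in the form M(C − E)/(M − E), both read off from the worked figures above; the function names are illustrative only.

```python
M = 6.0            # maximum reaction potential (sigma)
THRESHOLD = 0.355  # reaction threshold (sigma)


def generalize(peak, d, rate=0.0125):
    """Value of a gradient with the given peak at a distance of d j.n.d."""
    return peak * 10 ** (-rate * d)


def withdraw(c, e, m=M):
    """Assumed form of equation 13: the amount which, taken from c, leaves e."""
    return m * (c - e) / (m - e)


e_at_s2 = generalize(4.0, 30)                      # reaction potential at d = 30
inhibition_needed = withdraw(e_at_s2, THRESHOLD)   # conditioned inhibition at S2

print(f"sEr at d=30: {e_at_s2:.4f}")
print(f"sIr needed to reach threshold: {inhibition_needed:.4f}")

# The inhibition itself generalizes along the continuum from S2.
for distance in (0, 10, 20, 30):
    print(distance, round(generalize(inhibition_needed, distance), 4))
```

The value at d = 30 agrees with the 1.6868σ of the text, and the inhibition required agrees with 1.4156 to within rounding of the last decimal place.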
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than are those which are closer, and that this differential resistance 
to extinction may be an indispensable contributing cause producing 
the gradient. 

However this may be, one thing is clear: the conditioning of 
R1 to S1 at one point of a stimulus continuum in fact sets up the
potentiality of a generalization gradient which becomes manifest 
either (a) by differential resistance to extinction, or (b) by the 

[Figure axis label: deviations (d) from conditioned stimulus in j.n.d.’s]
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this discrimination gradient will be the stimulus generalization 
gradient without change. If we move S2 up to a d of 80 or 70, it is
evident to inspection that the maximum amount of sIr will be so
small that its generalization will not change the stimulus generaliza-
tion gradient appreciably. However, as we move S2 up to d = 60
and especially to 50, it is clear that the curvature of the discrimina-
tion gradient will grow less, its slope will grow more steep, and its
height, sEr, also will grow less, ultimately becoming zero when
d is 0. Various theoretical discrimination gradients of this type,



FIGURE 25. Graph representing the theoretical net discriminatory reaction potential
(sEr) as a function of the difference (d) between the reinforced and the discriminated
stimulus.

as computed on the above assumptions, are represented in Figure
24. The d value involved in each is indicated by the lowest level
(sLr) reached by each.

Because of the theoretical importance of the relationship, we
have also plotted the net discriminatory reaction potential (sEr)
remaining after complete differential reinforcement as a function
of d. This is shown in Figure 25. There it may be seen that the
sEr begins falling very slowly as d decreases, and falls progressively
more rapidly as d approaches zero.

Finally it may be stated that the ratios of sEr to d were calculated
for the values represented in Figures 24 and 25. They were found
to increase progressively as d approaches zero.
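
The relationship plotted in Figure 25 may be approximated by the following sketch, offered only as an illustration of the procedure. It assumes the gradient 4.0 × 10^(−.0125d), a threshold of .355σ, M = 6.0σ, equal slopes for the excitatory and inhibitory gradients, and equation 13 in the form M(C − y)/(M − y). The function `net_potential` is a hypothetical name, and its output is a reconstruction, not the plotted values themselves.

```python
M, PEAK, THRESHOLD, RATE = 6.0, 4.0, 0.355, 0.0125


def generalize(value, d):
    """Value remaining at a distance of d j.n.d. along the continuum."""
    return value * 10 ** (-RATE * d)


def withdraw(c, y):
    """Assumed form of equation 13: what remains of c when y is withdrawn."""
    return M * (c - y) / (M - y)


def net_potential(d):
    """Reaction potential left at S1 when S2, d j.n.d. away, is fully extinguished."""
    e_at_s2 = generalize(PEAK, d)
    if e_at_s2 <= THRESHOLD:
        return PEAK                      # S2 is already at or below threshold
    inhibition_at_s2 = withdraw(e_at_s2, THRESHOLD)
    inhibition_at_s1 = generalize(inhibition_at_s2, d)
    return withdraw(PEAK, inhibition_at_s1)


for d in (80, 60, 40, 30, 20, 10, 5):
    e = net_potential(d)
    print(f"d={d:2d}  sEr={e:.3f}  sEr/d={e / d:.3f}")
```

On these assumptions the value at d = 10 is about 2.92σ, close to the 2.91σ cited for two discriminanda, and the ratio of sEr to d rises as d shrinks. At d = 0 the sketch returns the threshold value itself, that is, nothing above threshold remains.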

Generalizing on the preceding considerations, we arrive at our 
thirteenth theorem: 
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It is plotted as the shaded double-winged gradient at the bottom 
of the figure, inverted as it must be when plotted against the nega- 
tive scale at the left. 

Next the generalized conditioned inhibition must be withdrawn 
(∸) at each point from the corresponding reaction potential by the
repeated use of equation 13. A graphic representation of these
differences is shown as a broken line between the two basic gradi-
ents. The portion of this difference function falling between S1 and
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(bIe) on S 2 generalizes to some extent to Sj, which also contributes 
to the fall of the “black” curve, though not so much. But S1 is
reinforced at every response in which it is involved. It follows that
the “white” curve falls practically to its reaction threshold as 
differential reinforcement continues, whereas the “black” curve 
remains fairly high. 

Simple Discrimination with Three Discriminanda Presented Separately 

Our discussion of discrimination learning up to the present has
usually been concerned with the use of two discriminanda (20) or
stimuli, one of which is positive or reinforced (+), and the other
of which is negative or nonreinforced (−); the formula in this case
is, therefore, + −. The next most common number of discriminanda
used experimentally has been three. Of these, the combination
which has been utilized most is that where one stimulus is rein-
forced (+) and the stimuli at each side on the stimulus continuum
are not reinforced (−). The formula for this type of triple dis-
criminanda is − + −. We proceed now with the quantitative
theoretical analysis of this type of discrimination. It is somewhat
more complex than that for two stimuli just described, but the
principles are at bottom the same.

Let it be supposed that we have a learning situation substantially 
like that represented in Figure 23 except that the two negative 
stimuli are placed at 10 d units on each side of the reinforced 
stimulus (Figure 26). We first calculate the generalization where 
d = 10. 

$$ {}_{S}E_{R} = 4.0 \times 10^{-.0125 \times 10} = 3.0. $$

From this we withdraw (∸) the reaction threshold of .355 by
using equation 13. Substituting, we have,

$$ {}_{S}I_{R} = \frac{6.0(3.0 - .355)}{6 - .355} = \frac{15.87}{5.645} = 2.811, $$

which is the amount of sIr which must be developed at d = 10 to
reduce the sEr at that point to the reaction threshold.

At this stage we must determine the amount of conditioned in- 
hibition which is generated in connection with each of the two 
negative stimuli. They are separated by twice d = 10, or d = 20.
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THEOREM 13 A. When differential reinforcement is applied to 
Si* and S 2 ., with d practically the entire range of the stimulus contin- 
uum, discrimination between the two will be learned by the average 
rat. 

B. The gradient of sEr evocable by the stimulus continuum as a whole
becomes markedly concave upward, with an asymptote which is the
reaction threshold (sLr).

C. As the range of d is decreased, the discrimination gradient con-
necting S1 and S2 becomes less concave upward, passing into a slightly
convex upward form after d = 30.

D. As the range of d is decreased, the more nearly vertical will be a
straight line drawn from the reaction potentials of S1 and S2.

E. As the range of d is decreased, the smaller will be the net dis-
criminatory reaction potential attached to S1, sEr equalling zero when
d equals zero.

F. As the range of d is decreased, the ratio of the sEr to d at S1
increases with a positive acceleration.


Up to the present time reports of experimental work on the 
extremely simple form of discrimination learning which is the basis 
of the preceding analysis have appeared in only three published
studies (3; J; 76). So far as reported, none of these studies disagrees
noticeably with Theorem 13. Frick (3, p. 119) presents results
... 13 A and 13 B, and Raben reports data
... harmony with 13 A and 13 B.
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as the value of the conditioned inhibition at each of the two points 
of differential inhibition. 

We next calculate the generalized conditioned inhibition at the 
point of intersection of the two generalized inhibitory gradients 
midway between them (d = 10), 

$${}_{S}\underline{I}_{R} = 2.052 \times 10^{-.0125 \times 10} = 1.538.$$

But since there are two of these values at this point they must be 
summated by equation 11, 

$$1.538 + 1.538 - \frac{1.538 \times 1.538}{6} = 2.682,$$

which is the value of a typical point on the dot-dash line immedi- 
ately beneath. This line represents the summation (∔) of the two 
generalized sIr gradients throughout the range represented. 
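The summation used here can be sketched in the same way. This is an illustrative sketch only, assuming equation 11 has the form a + b - ab/M with M = 6.0, which is the form the figures above follow. `summate` is a hypothetical helper name.

```python
def summate(a, b, ceiling=6.0):
    """Behavioral summation of two like quantities (assumed form of equation 11)."""
    return a + b - a * b / ceiling


# Two generalized inhibitory gradients of 1.538 each meeting at the midpoint.
print(round(summate(1.538, 1.538), 3))  # 2.682
```

Because of the subtracted product term, the summated value always falls short of the simple arithmetic sum and cannot exceed the assumed ceiling.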

The final step of the determination of the discriminatory reaction 
potential is to withdraw (∸) this 2.682 of sIr from the maximum 
reinforcement at the top, 4.0σ. Once more using equation 13, we 
have, 

$${}_{S}E_{R} = \frac{6(4.0 - 2.682)}{6 - 2.682} = 2.383.$$


This is represented by the high point of the broken line above S1 
and is to be compared with the sEr yielded by the two discriminanda 
(see Figure 25), which amounts to 2.91σ, represented by the 
isolated dot directly above. 
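The whole chain of steps for the three-discriminanda case can be put together in one short sketch. It rests on several assumptions that are not stated in this passage: a generalization exponent of .0125 per unit of d (inferred from the ratio 3.0/4.0 at d = 10), a ceiling of 6.0 in equations 11 and 13, and the helper names used below, all of which are hypothetical.

```python
import math

CEILING = 6.0        # assumed upper limit used in equations 11 and 13
THRESHOLD = 0.355    # reaction threshold given in the text
EXPONENT = 0.0125    # assumed generalization exponent per unit of d


def generalize(value, d):
    """Scale a value by the assumed generalization factor 10**(-EXPONENT * d)."""
    return value * 10 ** (-EXPONENT * d)


def summate(a, b):
    return a + b - a * b / CEILING


def withdraw(total, part):
    return CEILING * (total - part) / (CEILING - part)


def triple_discrimination(peak=4.0, d=10.0):
    """Net reaction potential at S1 with a negative stimulus d units to each side."""
    # Inhibition needed at each negative stimulus to reach the threshold.
    needed = withdraw(generalize(peak, d), THRESHOLD)
    # Each negative point holds its own inhibition plus the other's, generalized
    # over 2d: i + g*i - g*i*i/CEILING = needed. Take the smaller quadratic root.
    g = 10 ** (-EXPONENT * 2 * d)
    a, b, c = g / CEILING, -(1 + g), needed
    own = (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a)
    # Both inhibitory gradients generalize to S1 and summate there.
    at_center = summate(generalize(own, d), generalize(own, d))
    return withdraw(peak, at_center)


print(round(triple_discrimination(), 2))  # 2.38
```

The result falls below the 2.91σ cited for two discriminanda, which is the comparison the text draws.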

Generalizing from the preceding considerations, we arrive at our 
fourteenth theorem: 

THEOREM 14. Discrimination learning with three discriminanda 
in the form − + − is possible, but is more difficult than is comparable 
discrimination learning with two discriminanda (+ −), because in the 
− + − form the conditioned inhibition (sIr) generalizes upon the 
reinforced reaction potential from both sides, summating at S1, the 
slope of this summation gradient being much less steep than would be a 
single sIr gradient from the same maximum. 

No experiments involving the separately presented stimuli dis- 
crimination analyzed above in the use of the − + − form of 
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Substituting in the usual generalization equation, we have, 

$${}_{S}\underline{I}_{R} = {}_{S}I_{R} \times 10^{-.0125 \times 20} = .5624\,{}_{S}I_{R}.$$

This means that there will be combined the full sIr at each dis- 
crimination point, and the generalized sIr from the other point. 



FIGURE 26. Diagrammatic representation of the interaction of the gradient of reac- 
tion potential (sEr, the solid line above) with the two gradients of conditioned inhibi- 
tion (sIr, the two crossed solid lines through the stippled region below) which com- 
bine (∔) to produce the total sIr (dot-dash line below). The withdrawal (∸) of these 
latter values from the sEr values above yields the discriminatory reaction potential 
represented by the broken line between. The isolated circle above the maximum of the 
broken line representing the net discriminatory reaction potential (sEr) shows, for com- 
parative purposes, the net discriminatory reaction potential yielded by two discrimi- 
nanda at d = 10, as shown in Figure 25. The figure as a whole represents the dynamics 
of the − + − type of triple-discriminanda discrimination. 

which together make up 2.811σ of conditioned inhibition. Substi- 
tuting appropriately in the summation equation 11, we have, 

$${}_{S}I_{R} \dotplus .5624\,{}_{S}I_{R} = {}_{S}I_{R} + .5624\,{}_{S}I_{R} - \frac{{}_{S}I_{R} \times .5624\,{}_{S}I_{R}}{6}$$

$$= 1.5624\,{}_{S}I_{R} - .09373\,{}_{S}I_{R}^{2} = 2.811.$$

Solving this equation and taking the negative root, we have, 

$${}_{S}I_{R} = 2.052,$$

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the assumption that D = 6.0σ, V2 = .939, and K = .71. How- 
ever, in the quantitative or stimulus-intensity aspect of stimulus 
generalization of reaction potential, two components of this equa- 
tion are varied, viz., sHr and V2, the product of the other two 
(D × K) being held constant at 4.0572σ. But here we must note 
a striking fact: not only are V1 and V2 dependent upon stimulus 
intensity (S), but sHr is also. The novelty is found especially in the 
derivation of sHr, which we now proceed to consider. In the present 
situation the equation for generalized habit strength is, 

$${}_{S}\bar{H}_{R} = {}_{S}H_{R} \times 10^{-.15d} \qquad (44)$$

where d, instead of being the difference between S1 and S2 in 
j.n.d.’s as in qualitative stimulus generalization, is the difference 
between the logarithms of S1 and S2. 

To make this and its role in stimulus-intensity generalization 
quite clear we shall present a typical derivation of a theoretical 
generalized reaction potential. The assumed conditions are that 
the habit was set up to a visual stimulus of 1010 units (e.g., milli- 
lamberts), and that it generalized to, i.e., the response was evoked 
by, a stimulus of 10 units of intensity. The logarithms of these two 
stimulus intensities are, respectively, 3.00432 and 1.00000 (see 
Table 6). The difference between the two logarithms is 2.00432, 
which equals d. Substituting this value in equation 44, we have, 

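$${}_{S}\bar{H}_{R} = {}_{S}H_{R} \times 10^{-.15 \times 2.00432} = {}_{S}H_{R} \times 10^{-.30065}.$$

Assuming that the original habit strength was at its maximum, i.e., 
that 

$${}_{S}H_{R} = 1.0,$$

we have, 

$${}_{S}\bar{H}_{R} = 1.0 \times \frac{1}{2.0255} = 1.0 \times .4937$$

$$\therefore\ {}_{S}\bar{H}_{R} = .4937,$$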

which appears as the fourth entry of column 1, Table 6. 

We next substitute log S, as always, in equation 6, 

$$V_2 = 1 - 10^{-.44 \log S} \qquad (6)$$
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triple discriminanda have been found. However, several studies 
have been reported which used the more complex form of simul- 
taneous presentation or comparison of three different-sized visual 
objects. Lashley (14, p. 164) used three white circles on a black 
ground and found that rats did not learn to choose the middle-sized 
circle within the amount of training given, though in the + − 
form this same amount of training was sufficient to yield the choice 
of either the largest or the smallest circles. Spence (IS), working 
with chimpanzees and using squares of 100, 160, and 256 sq. cm. 
in area made of white enameled sheet-iron clamped to separate 
food boxes, found that the problem concerned with the intermedi- 
ate-size of square was learned on the average in 145 trials, whereas 
the problem concerned with two such discriminanda was learned 
in a mean of only 80 trials. Thus the empirical evidence on this 
type of discrimination learning agrees substantially with the results 
of the theoretical analysis of the separate presentation form. More- 
over, an examination of the analysis represented in Figure 26, which 
follows substantially Spence’s analysis (7P, pp. 259 ff.), indicates 
that the essential reason for the greater difficulty of the discrimina- 
tion problem involving intermediate size is that in the latter the 
sIr generalizes from two directions, converging upon S1 where it 
summates, this summation being much greater than would be the 
ordinary generalization gradient from the maximum conditioned 
inhibition at S2. This is shown by the gentle slope from S2 to S1 of 
the sIr (dot-dash) line at the bottom of Figure 26. 

The Generalization of Reaction Potential (sEr) Based on Stimulus Intensities 
In a preceding section of this chapter we observed the role that 
qualitative or subjective stimulus generalization (j.n.d. scale) plays 
in the determination of reaction potential. It is now our task to 
consider how this operates in the case of quantitative or objective 
stimulus intensity as measured by a physical scale. 

Here the determination of sEr is based on equation 20, 

$${}_{S}E_{R} = D \times K \times V_2 \times {}_{S}H_{R}, \qquad (20)$$

where V2 represents the response evocation conditions. In our treat- 
ment of qualitative stimulus generalization earlier in the chapter, 
all values on the right-hand member of this equation were held 
constant except sHr, the product of D × V2 × K being 4.0σ on 
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B. When a corresponding reaction potential (sEr) generalizes from 
a weaker to a stronger stimulus intensity (in the lower range of magni- 
tude) the gradient is relatively gentle in its downward slope and its 
curvature is for the most part concave upward. 

C. The gradient originating at the weaker stimulus intensity has a 
markedly lesser sEr at d = 0 than the one originating at the stronger 
stimulus intensity. 

We are fortunate in having empirical evidence bearing directly 
on the soundness of this theorem in an experiment reported by 



FIGURE 27. Graphs representing theoretical stimulus generalization reaction poten- 
tial gradients starting at opposite extremes of the same range of stimulus intensities and 
extending to the other extreme. They are plotted on the basis of ordinary stimulus- 
intensity units. The points of origin are represented by solid black circles. 

Judson Brown (2). But before considering Brown’s results let us 
note a striking change which takes place in the curvature of the two 
theoretical gradients when they are plotted on the basis of log S, 
only three of the data points being used: the first, the last, and that 
which falls at the mean of these two log values. In order to show 
this we have replotted the three points in question as Figure 28. 
There it may be seen at once that the gradient of generalization 
from a strong to a weak stimulus extreme now has a slight but clear 
concave-upward curvature, whereas the gradient generalizing 
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and solve for V2, securing .6391, which appears as the fifth entry 
of column 1, Table 6. We now use equation 20 in the form, 

$${}_{S}\underline{E}_{R} = V_2 \times {}_{S}\bar{H}_{R} \times 4.0572\sigma.$$

Substituting in this equation from Table 6, we have, 

$${}_{S}\underline{E}_{R} = .6391 \times .4937 \times 4.0572$$

$$\therefore\ {}_{S}\underline{E}_{R} = 1.279,$$

which appears as the bottom entry in column 1, Table 6. 

The sEr values in the other columns of this table were calculated 
in an exactly analogous manner. They are represented graphically 


TABLE 6. Table showing the derivation of a theoretical generalized reaction potential 
(sEr) gradient based on stimulus generalization from a strong (S1) to a weak (S2) 
stimulus intensity. 

Stimulus intensity (S)               10       100.5    210      410      610      810      1010
Log S                                1.00000  2.00216  2.32222  2.61278  2.78533  2.90849  3.00432
d from strong to weak
  stimulus intensity                 2.00432  1.00216  .68210   .39154   .21899   .09583   .00000
Generalized habit strength (sHr)     .4937    .7075    .7901    .8735    .9153    .9676    1.0000
Stimulus-intensity dynamism (V)      .6391    .8623    .9052    .9292    .9405    .9475    .9524
Generalized reaction potential
  (sHr × V2 × 4.0572 = sEr)          1.279    2.475    2.902    3.293    3.538    3.720    3.864



in the continuous-line curve of Figure 27. The corresponding reac- 
tion potential values where the generalization extends in the 
opposite direction, i.e., from 10 units to 1010 units, were calculated 
by exactly the same principle but are not shown in the table. These 
results are also presented in Figure 27, but by the broken line. 

Generalizing on the preceding considerations as represented in 
Figure 27, we arrive at our fifteenth theorem: 

THEOREM 15 A. When a reaction potential (sEr) generalizes 
from a stronger to a weaker stimulus intensity (in the lower range of 
magnitude), the gradient is relatively steep in its downward slope and 
its curvature is convex upward. 
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represented in Figure 29, after the manner of the theoretical 
Figure 28. 

A comparison of Figure 29 with Figure 28 shows that when 
plotted against log intensities: 

1. The generalization gradient from strong toward weak agrees 
in having a relatively steep fall with a concave-upward curvature. 



FIGURE 29. Graphic representation of empirical reaction potential as modified by 
stimulus-intensity generalization and stimulus-intensity dynamism, corresponding 
roughly to Figure 28. Plotted from data published by Brown (2) as represented in a 
previous publication by the present author (IS). 

2. The generalization gradient from weak to strong agrees in 
having a relatively gentle fall with a convex-upward curvature. 

3. The origin of the weak-to-strong gradient is considerably 
lower than that of the strong-to-weak gradient. 

Thus all three points of Theorem 15 appear to be substantiated. 

At this point we note an implication of Theorem 15 and Figure 
27 which arises from the fact that stimulus generalization logically 
extends in two directions from an intensity to which a response 
has been reinforced. Thus if the two gradients shown in Figure 27 
originated at the same point, i.e., at S = 1010 units, and extended 
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from a weak to a strong stimulus extreme has a definite convex- 
upward curvature, an exact reversal of the direction of curvature 
in each case from that shown in Figure 27. 

With this relationship in mind we turn to Brown’s stimulus- 
intensity generalization investigation. He trained two groups of 
rats to go to food on a straight runway. During the learning the 
food reward was associated with screens which were illuminated 
by a very weak light with one group, and by a very strong light 
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
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FIGURE 28. Three-point graphs representing theoretical stimulus generalization reac- 
tion potential gradients plotted on the basis of log stimulus intensity. Note that the 
three values of each gradient here represented are exactly the same as the correspond- 
ing values represented in Figure 27 but that the difference in the manner of plotting 
reverses the curvature of both gradients. 

with the other group. After the learning, Brown secured a quanti- 
tative measure of the strength of the rats’ tendency to go to screens 
[...] illumination. As the measure of this reaction potential 
he utilized the mean magnitude of the pull of the animals when put 
in a little rubber harness halfway down the runway from the 
screen. The test lights placed on the screen were: the extreme weak 
illumination, the extreme strong illumination, and an illumination 
corresponding approximately to the mean log of the intensity of 
these two. Each group of animals pro- 
duced a different generalization gradient. These gradients are both 
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Next, reaction potential at S1 is calculated by equation 20, 

$${}_{S}E_{R} = D \times V_2 \times K \times {}_{S}H_{R}. \qquad (20)$$

But here the S2 (of response evocation) is the same as the S1 (of 
original learning). But since 

$$D \times K \times {}_{S}H_{R} = 4.0572,$$

and 

$$V_1 = .4568,$$

it follows that 

$${}_{S}E_{R} = 4.0572 \times .4568 = 1.8533,$$

which appears as the fourth entry in the first number column of 
Table 7. 
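The arithmetic of this step can be sketched as follows. The sketch assumes that stimulus-intensity dynamism follows V = 1 - 10^(-.44 log S); the coefficient .44 and the helper names are assumptions of this example rather than values quoted in this passage. The V1 of .4568 is taken from the text as printed rather than recomputed.

```python
import math

D_TIMES_K = 4.0572  # product D x K x sHr given in the text


def dynamism(intensity, coefficient=0.44):
    """Assumed form of the stimulus-intensity dynamism equation."""
    return 1 - 10 ** (-coefficient * math.log10(intensity))


# Reaction potential at S1, using the V1 value printed in the text.
print(round(D_TIMES_K * 0.4568, 4))  # 1.8533

# Dynamism for the 24-unit stimulus under the assumed equation.
print(round(dynamism(24), 3))  # 0.753
```

Under the assumed equation the dynamism rises with intensity but with diminishing returns, which is why the same 20-unit difference matters less at higher intensities.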


TABLE 7. The major steps in the computation of the theoretical discriminatory 
reaction potential (sEr) at the intensity stimuli of reinforcement for two adjacent 
pairs of stimuli (S1 and the contrasted stimulus, S2) in millilamberts. 

1  Stimulus intensities (S) in            From 4 (S1)     From 24 (S1)    From 24 (S1)
   discrimination                         to 24 (S2)      to 4 (S2)       to 44 (S2)
                                          millilamberts   millilamberts   millilamberts

2  Stimulus-intensity dynamism (V1)       .4568           .7530           .7530
3  Stimulus-intensity dynamism (V2)       .7530           .4568           .81082
4  Reaction potential at S1
   (V1 × 4.0572 = sEr)                    1.8533          3.0551          3.0551
5  Generalized habit strength (sHr)       .76435          .76435          .91308
6  Generalized reaction potential (sEr)
   at S2 (sHr × V2 × 4.0572)              2.3351          1.4166          3.0037
7  sLr withdrawn (∸) from sEr
   (at S2) = sIr                          2.1046          1.1284          2.8153
8  Generalized sIr (at S1),
   i.e., sIr × sHr = sIr                  1.6087          .8624           2.5706
9  sIr withdrawn (∸) from sEr
   at S1 = sEr                            .3342           2.5608          .8477


At this point we calculate the generalization of the sHr from 
S1 to S2: 

$${}_{S}\bar{H}_{R} = {}_{S}H_{R} \times 10^{-.15 \times .77815} = 1.0 \times .76435$$

$$\therefore\ {}_{S}\bar{H}_{R} = .76435.$$
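A short numerical check of this generalization step may be helpful. It is a sketch only: it takes d as the difference between the common logarithms of the two intensities and uses the exponent of .15 attributed to equation 44. The function name is hypothetical.

```python
import math


def generalized_habit(habit, s1, s2, exponent=0.15):
    """Equation 44 with d taken as the absolute difference of log intensities."""
    d = abs(math.log10(s1) - math.log10(s2))
    return habit * 10 ** (-exponent * d)


print(round(generalized_habit(1.0, 4, 24), 3))  # 0.764
```

Because d depends only on the ratio of the two intensities, the same function applied to 24 and 44 units gives a factor near .913, a much smaller loss for the same 20-unit difference.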
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over a total range from 10 to 2010 units, the broken-line gradient 
would have its point of origin increased to 4.057σ (through the 
increase in V2) and would extend to S = 2010, falling appreciably 
less than it does in Figure 27. From these considerations we arrive 
at our sixteenth theorem: 

THEOREM 16. Stimulus generalization reaction potentials (sEr) 
extend in both directions along a stimulus-intensity continuum from a 
single point of reinforcement, the wing extending toward increasing 
stimulus intensities on the whole having much higher reaction potentials 
for given deviations from the reinforcement point, especially as d 
increases. 


We have been unable to find any empirical data bearing directly 
upon Theorem 16 as distinguished from Theorem 15. 

The Simple Discrimination of Objective Stimulus Intensities 
We shall continue the subject of stimulus-intensity generalization 
in this section by considering how the same factors operate jointly 
in simple stimulus discrimination. It may be recalled that in the 
discrimination considered above (Figure 23) the data dealt with 
were treated as non-quantitative, i.e., they were treated as quali- 
tative in nature, since the value of V was held constant and the 
d values were based on j.n.d.’s. 

Let us now consider the theoretical discrimination of two light 
intensities in the lower range: 4 units and 24 units (millilamberts). 
The d in this case is therefore the difference between the logarithms 
of 4 and 24 respectively, i.e., 


1.38021 — .60206 = .77815. 

[...] generalization of reaction potential con- 
sidered in the immediately preceding section, we shall assume that 
[...] and that the generaliza- 
tion exponent is −.15 (equation 44). 

[...] computation of this discrimination 
are entered in Table 7. V1 and V2 are calculated by [...]. 
[...] above, the S1 is 4 units and the S2 is 24 units. [...] 
a V1 of .4568 and a V2 of .7530 [...] 
of the first number column of Table 7. 
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from our postulates with a reaction-potential reduction from 1.8533 
to .3342, the outcome being in this respect much like the qualitative 
discrimination considered above (pp. 69 ff.). 

Generalizing on the preceding considerations, we arrive at part 
A of our seventeenth theorem: 

THEOREM 17 A. Simple stimulus-intensity discrimination can be 
learned, with a reduction in reaction potential as an inverse function 
of d, much as is the case with qualitative discrimination. 

Our next problem concerns the theoretical effectiveness of dis- 
criminatory learning as dependent on whether the reinforced 
stimulus is the more or the less intense of a pair of stimuli which are 
identical except that one is reinforced and the other is not. In the 
case represented in number-column 1 of Table 7, the intensities 
were 4 and 24; the stimulus of less intensity, 4 units, was reinforced, 
and the one of greater intensity, 24 units, was extinguished. We 
accordingly proceed to calculate the possible net reaction potential 
(sEr) of the opposite case, where the stimulus of greater intensity, 

24 units, is reinforced and the one of less intensity, 4 units, is not. 
Employing the same principles of determination as before, we 
secure the results which appear in number-column 2 of Table 7; 
row 9 of this column shows that the final net reaction potential is 
2.5608, as distinguished from the .3342 of the reverse situation. 

Generalizing from the preceding considerations, we arrive at 
part B of Theorem 17: 

THEOREM 17 B. When the simple discrimination of two stimulus 
intensities occurs, the difference between the intensities remaining 
constant, the process is more effective in terms of the net reaction poten- 
tial yield when reinforcement is given to the more intense rather 
than to the less intense of the two discriminanda. 

Our final problem here concerns the theoretical influence on 
discriminatory learning effectiveness resulting from an increase of 
the stimulus intensities while the difference between them is kept 
constant. In the two cases just considered the difference was that 
between 4 and 24, or 20 units. Let us now examine the discrimina- 
tory effectiveness of the intensities when they are increased to 24 
and 44 units respectively. The derivation of sEr in this case is 
presented in the third number-column of Table 7. A comparison 
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This value appears as the fifth entry in the first number column of 
Table 7. From this we secure the generalized reaction potential: 

$${}_{S}\underline{E}_{R} = {}_{S}\bar{H}_{R} \times V_2 \times 4.0572$$

$$= .76435 \times .7530 \times 4.0572$$

$$= 2.3351,$$


which appears as the sixth entry in the first number column of 
Table 7. 

Now this value is extinguished by differential reinforcement to 
the reaction threshold, which as usual is taken as .355σ. We ac- 
cordingly withdraw (∸) .355σ from 2.3351 to determine how much 
sIr will be generated in the process. This is done by means of 
equation 13. Substituting appropriately in this equation and solv- 
ing, we find that sIr will be 2.1046, which is the seventh entry in 
the first data column of Table 7. Next, this generalizes back 
from S2 to S1. There is considerable uncertainty concerning the 
[...] of the generalization of sIr. Considerations of an a priori 
nature fail us here and, as always in such cases, resort ultimately 
[...] provisional computations given 
[...] has disappeared, though 
[...] this and other 
uncertainties the computation is given in detail: 

$${}_{S}\underline{I}_{R} = 10^{-.15 \times .77815} \times 2.1046$$

$$= .76435 \times 2.1046$$

$$= 1.6087.$$

[...] number column of 
[...] 1.8533. Substituting in equation 13, we have, 

$${}_{S}E_{R} = \frac{6.0(1.8533 - 1.6087)}{6.0 - 1.6087} = \frac{1.4676}{4.3913} = .3342.$$

Solving, we secure .3342, which 
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ciency (sEr) grows less as the intensity of the two discriminanda is 
increased, the absolute difference between the stimuli remaining 
constant. This is neither Weber’s law nor Fechner’s law, though 
it is closely related to both. 

This long-known relationship is apparent, though in a somewhat 
indirect manner, in the very specialized type of discrimination 
involving the separate presentation of the discriminanda already 
considered in this chapter. After Antoinetti’s animals had about 
reached the limit of discrimination learning between black and 
white, he proceeded gradually to decrease the difference between 
the discriminanda of his two groups of animals in such a way that 
one group was discriminating between two discriminanda at the 
lighter extreme (near white) and the other group at the darker 
extreme (near black) of reflectance intensity yielded by the Munsell 
coated papers. At the darker stimulus extreme the stimuli were 
1.210 per cent and 6.555 per cent respectively, with a reflectance 
difference of 5.345 per cent. This yielded an sEr value of 1.23σ. 
At the lighter stimulus extreme the two stimuli were 78.66 per cent 
and 43.06 per cent respectively, with a reflectance difference of 
35.60 per cent. This large difference yielded the comparatively 
small sEr of .90σ. But, 

1.23σ > .90σ, 

which is the exact reverse of Theorem 17 C. This difference is not 
significant. 

The above results are complicated by the fact that in the case 
of the weak stimulus discriminanda the weaker stimulus of the pair 
was the one reinforced, whereas in the case of the strong dis- 
criminanda the stronger stimulus of the pair was the one reinforced. 
As shown by Theorem 17 B and related evidence, this would have 
favored the apparent discrimination power of the more intense 
pair of discriminanda. Even so, the inverse relation of discrimina- 
tion power to the intensity of the two discriminanda is very marked. 
We accordingly conclude that Theorem 17 C is not valid. 

This obviously implies a serious defect somewhere in the postulate 
set, or in the analysis of the process, or in both. Nevertheless we are 
of the opinion that this or any other system of behavior theory if 
thoroughly sound must be able to deduce the fact that discrimina- 
tion efficiency grows less, other things equal, as the stimulus 
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of the last item in number-columns 1 and 3 shows that 


.3342 < .8477, 

i.e., the sEr for discriminating an intensity of 24 units from one 
of 44 units is greater than that for discriminating one of 4 units as 
compared with 24 units. This, however, is probably contrary to 
empirical expectation. 
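The three cases compared here can be recomputed with one small routine. This is a sketch under stated assumptions: the ceiling of 6.0 in the withdrawal equation, the exponent of .15 in the generalization equation, and the same generalization factor applied to both habit strength and conditioned inhibition, as in the provisional computation of Table 7. The V values are those printed in Table 7, and the function and variable names are hypothetical.

```python
import math

CEILING = 6.0      # assumed upper limit used when withdrawing (equation 13)
THRESHOLD = 0.355  # reaction threshold used in the text
SCALE = 4.0572     # D x K x sHr, as given in the text


def withdraw(total, part):
    return CEILING * (total - part) / (CEILING - part)


def net_potential(s1, s2, v1, v2, exponent=0.15):
    """Net discriminatory reaction potential at the reinforced stimulus s1."""
    g = 10 ** (-exponent * abs(math.log10(s1) - math.log10(s2)))
    potential_s1 = v1 * SCALE                            # row 4 of Table 7
    generalized_s2 = g * v2 * SCALE                      # row 6
    inhibition_s2 = withdraw(generalized_s2, THRESHOLD)  # row 7
    inhibition_s1 = g * inhibition_s2                    # row 8
    return withdraw(potential_s1, inhibition_s1)         # row 9


cases = {
    "4 reinforced vs 24": (4, 24, 0.4568, 0.7530),
    "24 reinforced vs 4": (24, 4, 0.7530, 0.4568),
    "24 reinforced vs 44": (24, 44, 0.7530, 0.81082),
}
for label, args in cases.items():
    print(label, round(net_potential(*args), 2))
```

Run as written, the three results come out near .33, 2.56, and .85, in line with the .3342, 2.5608, and .8477 of Table 7; small differences in the later decimals are to be expected from rounding of intermediate values.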

Generalizing on these considerations, we arrive at part C of The- 
orem 17, even though it may be contrary to empirical expectation: 

THEOREM 17 C. When the simple discrimination of two stimulus 
intensities occurs, the difference between the intensities remaining con- 
stant, the effectiveness of the discriminatory process in net reaction- 
potential (sEr) yield increases as the intensities of the two discrimi- 
nanda increase. 


A small amount of evidence bearing directly on all three parts 
of Theorem 17 is found in Antoinetti’s unpublished experimental 
results. These indicate that the type of discrimination learning 
there shown, with the more intense of the discriminanda reinforced, 
was successful. The reflectance difference between the two papers 
used as visual objects was 77.45 per cent. The sEr was 4.36σ. Thus 
part A of Theorem 17 finds partial empirical verification. 

Regarding the validity of part B of Theorem 17, Antoinetti’s 
results show that in the reverse case from that just considered, where 
the less intense stimulus of the pair was reinforced, the sEr yielded 
was only 2.65σ. But, 


4.36σ > 2.65σ. 

This difference is significant at the five per cent level of confidence. 
Thus part B of Theorem 17 finds apparent empirical verification. 

As to the validity of part C of Theorem 17, [...] 
fact is probably contrary to empirical [...] 
system must be exhibited along with the [...] 
advance the science adequately. Actually [...] 
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This relationship is [...] 
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beginning of the experiment without the buzzer stimuli, and the 
apparatus is set so that a movement to the right will give a pellet, 
then a movement to the left will give a pellet, and so on in an 
irregular alternation up to a total of ten pellets. After this pre- 
liminary training the manipulandum is carefully cleaned and the 
apparatus is set to give the reward only while the particular sound 
is being presented as described above. At this stage, but one trial 
will be given per day and the sound stimuli will be alternated, one 
at each trial, in an irregular manner so that in the course of forty 
days, twenty sounds will be of loud and twenty will be of moderate 
intensity. 
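One way to draw up such a forty-day schedule is sketched below. The text specifies only that the order be irregular with twenty trials of each intensity; a seeded random shuffle is an assumption of this example, as are the function name and the labels.

```python
import random


def make_schedule(days=40, seed=None):
    """Return an irregular order of 'loud' and 'moderate' trials, one per day."""
    trials = ["loud"] * (days // 2) + ["moderate"] * (days // 2)
    random.Random(seed).shuffle(trials)
    return trials


schedule = make_schedule(seed=0)
print(len(schedule), schedule.count("loud"), schedule.count("moderate"))  # 40 20 20
```

A plain shuffle does not limit runs of the same stimulus; an experimenter wanting such a limit would need to add a constraint of their own.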

It is to be expected that the “incidental” stimuli of the apparatus, 
and so on, will be associated about equally with both R’s. As in 
simple discrimination, these stimuli (S?) will receive reinforcement 
and non-reinforcement to approximately the same extent after the 
preliminary training. This will tend to equalize their strengths as 
the sEr and the sIr approach their respective asymptotes so that 
in time differential reinforcement will approximately neutralize 
the incidental reaction potential involved. But the primary theo- 
retical problem here is the discrimination of the strong from the 
moderate sound intensity of the stimulus continuum; i.e., to gener- 
ate conditioned inhibition to the generalized tendencies from each 
sound intensity to the response proper to each. This means that 
there are four stimulus-response combinations: 

1. When S1 is followed by R1, that combination generates s1Er1. 

2. When S1 is followed by R2, that combination generates s1Ir2. 

3. When S2 is followed by R2, that combination generates s2Er2. 

4. When S2 is followed by R1, that combination generates s2Ir1. 
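The four combinations listed above amount to a two-by-two contingency, which can be written out as a small sketch. The numeric encoding of stimuli and responses and the wording of the returned labels are illustrative assumptions, not notation from the text.

```python
def outcome(stimulus, response):
    """Return which tendency a stimulus-response pairing builds up.

    Hypothetical encoding: stimulus 1 pairs with response 1,
    stimulus 2 pairs with response 2.
    """
    if stimulus == response:
        return "reaction potential (reinforced)"
    return "conditioned inhibition (non-reinforced)"


for s in (1, 2):
    for r in (1, 2):
        print(f"S{s} -> R{r}: {outcome(s, r)}")
```

The loop prints the four stimulus-response combinations, two reinforced and two non-reinforced.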

It is evident from the above description that the theoretical 

problems of both simple discrimination and simple trial-and-error 
learning are involved here. In terms of associative connection 
analogous to that represented in Figures 1 and 17 with two con- 
necting lines each, one right and one wrong, we now have the 
situation represented by Figure 30, with four connecting lines, two 
right (reinforced) and two wrong (non-reinforced). 

So far as we can now see this presentation of S1 and S2 by an 
irregular alternation will produce a type of learning at each stim- 
ulus substantially like that represented in Figure 23, with the 
additional factor of response generalization between R1 and R2 
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intensity of the discriminanda grows greater, even though this 
deduction has never been made. 

Simple Single-Stimulus Presentation Discriminatory Trial-and-Error Learning 

While a certain amount of the simple discriminatory learning of 
the sort analyzed above has been reported (3; 76), the greater part 
of the work done in this field has been combined with trial-and- 
error learning in one form or another. It will be convenient at this 
point to introduce this combination of processes in a simple form. 
Let it be assumed that the albino rat is in an apparatus something 
like that shown above in Figure 13, except that there is only one 
manipulandum. This projects straight into the experimental 
chamber through a slit in the metal panel. It can be moved a 
little to right or left, one or the other movement automatically 
releasing a pellet of food into the food-cup according to the stimulus 
being presented and the response made. At once after the response, 
whichever it is, the manipulandum is automatically withdrawn 
through the panel to a point inaccessible to the animal so that no 
further manipulation on that trial will be possible. The critical 
[...] discriminated in this case are buzzer sounds, let us 
[...]. The two stimuli chosen are [...] 
sound [...] 50 decibels, and a less intense 
[...]. The apparatus is so arranged that when 
[...] and the manipulandum 
[...] of food always drops into the 
cup, but [...] pellet is found; and when 
the 50 d[...] movement 
is toward the right [...]. In [...] either false movement, the with- 
drawal of the manipulandum [...] any second choice 
on that trial. The task of the [...] is to associate the loud 
sound (S1) with the [...] movement (R1) 
+ food; and to associate [...] 
of the left-hand movement [...]. The selection (R1 or R2) 
of the movements is the trial[...] of the learning, and 
the power of making this [...] correctly on the basis of the 
respective sound signals is the discriminatory aspect of 
the learning. 
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the cues closely, looking first at one and then at the other before 
lifting the lid of one of the boxes (18, p. 432, footnote 3). 

The movements involved in examining the cues, i.e., in exposing 
the receptors to the relevant stimuli in such a problem situation, 
will be referred to as receptor adjustment acts. The detailed theory 
of the evolution of this type of habit will be presented later (Chap- 
ter 6) in connection with an account of compound trial-and-error 
learning, of which it is a small-scale example. 

In Lashley’s experiment a rat was placed on a stand separated 
some inches from two doors on each of which was a circle of white 
cardboard, one larger than the other. When the rat leaped against 
the large circle, say, the door would swing open easily and in the 
compartment beyond would be found a bit of food; the door with 
the small circle, on the other hand, would in such case be locked 
so that if the animal leaped against this circle it received a punish- 
ing blow from the impact against the unyielding surface, fell a 
short distance into a net, and received no food on that trial. In 
the course of a few hundred trials, the number depending upon 
the differences in the areas of the two circles, the rat would gradu- 
ally learn to look first at one of the cards and then at the other 
before jumping, and to jump only to the larger circle. 

In Köhler’s experiment hens were presented with kernels of 
grain on two sheets of gray paper, one sheet darker than the other. 
The hens were permitted to secure the grain from one shade of 
gray, but not from the other. After many trials the hens learned 
to attempt to eat only the grain which lay on the paper from which 
it could be secured. 

From a casual consideration of the three cases of simultaneous- 
stimuli presentation discrimination learning just cited, in the light 
of the preceding analysis, it is evident that we have to do here with a 
considerably more complex process than that of simple discrimina- 
tion learning. In addition to discrimination itself and the response 
selection of trial and error already considered in the preceding 
sections of the chapter, we have the phenomenon of comparison, which
results from the receptor adjustment acts. These acts themselves
result, in the first place, from the oscillation (sOr) of sEr as manifested
by the organism’s moving its head from side to side (Muenzinger
and Tolman’s “vicarious trial and error”). The reinforcement
of such movements occurs when they chance to be
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as discussed in detail in the description of simple trial-and-error 
learning (Chapter 2, especially pp. 22 ff.). With the reduction 
of the superthreshold reaction potential (sEr) of S2 -> R1 and of
S1 -> R2, each to near zero, there will remain portions of S1 -> R1
and S2 -> R2 after the withdrawal from them respectively of the



[Figure caption, largely illegible: a representation of the reinforced (+) and non-reinforced stimuli and responses in a simple discriminatory [...] situation; certain reaction potentials (sEr) represent generalized but maladaptive (-) tendencies.]

Joint-Stimuli Presentation Discriminatory Trial-and-Error Learning

[...] of simple discrimination [...] discriminatory
trial-and-error learning [...] involves the simultaneous presentation
of two stimuli to b[...]. Typical examples of such
empirical investigations are found in Spence’s experiments with
chimpanzees [...], Lashley’s experiments employing rats
[...], and Köhler’s transposition experiments with chickens. Actually
this procedure corresponds [...] experiments
performed on humans in connection [with the] Weber-Fechner law.

In Spence’s experiment [...] learned to identify two
separate food boxes on which [...] of different size had
been placed. The box with the [...] square could be opened, and
contained a bit of food, wh[...] locked. If the [...] latter box
[...] it was permitted no second [...].
Under such conditions the organisms soon learn to scrutinize



DISCRIMINATION LEARNING 




appearance of a generalization between S1 and S2 even if the latter
completely lack this tendency.

The primary process which gives rise to the discrimination of the 
stimulus complex is differential reinforcement. In the course of 
time with an equal number of occurrences of S1 and S2, both
s3Er1 and s3Ir1 tend to approach their asymptotes, which will reduce
s3Er1 toward its reaction threshold. Moreover, since the rates of both
learnings are reduced as the asymptotes are approached, presumably
S3 loses much of its capacity to acquire not only response R1
but also other responses employing to different degrees the same
effectors. This should produce extensive transfer-of-learning effects
wherever S3 is involved.

Along with the process of neutralizing S3 to R1, the differential
reinforcement gradually builds up S1 -> R1. But this reaction potential
generalizes to S2 -> R1, which is never reinforced, giving
rise to sIr, which generalizes back upon S1 -> R1 and reduces the
latter appreciably. The net result of this double generalization of
sEr and sIr is the marked loss by S2 of the power to evoke R1 and
the retention of a considerable though reduced power of S1 to
evoke R1. This seems to be the essence of discriminatory learning.

Discrimination learning may involve numerous discriminanda, 
and their combinations may vary widely as to the stimuli which 
are reinforced and those which are not reinforced. For example, 
in the case of three stimuli there may be the formulae + - [...],
or + + -, or [...] + [...], or [...]. Of these four we have analyzed
only the last. It is believed that the same general methodology
could be adapted to the remaining three, as well as to many other
possible forms. Theoretical analysis reveals that - + - had an
appreciably weaker yield of net sEr at the limit of learning than
+ -.

Stimulus intensities show stimulus generalization much as do 
qualitative stimulus similarities, but with differences. The d is 
believed to be based on log S, which produces an asymmetry in 
the gradients extending toward increasing intensities as contrasted 
with decreasing intensities. This introduces certain differences in 
the theory of quantitative discrimination as distinguished from 
qualitative discrimination. The theory in its present state yields a
fair deductive agreement with both intensity generalization and 
discrimination empirical facts, but in each there is a rather clear 
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followed by a successful instrumental act which itself is reinforced. 
In such a situation the receptor adjustment act is reinforced accord- 
ing to the delay-of-reinforcement principle (iii). The reception of 
the patterns of the stimuli in close succession by the eye, say, consti- 
tutes the comparison. 

The effectiveness of comparison itself depends upon two prin- 
ciples. The first is the principle of the stimulus trace (II), which 
allows the stimulus first received to persist until the second or 
comparison stimulus is received. These impulses thereupon undergo 
an interaction (XI) which changes each to some extent but not 


entirely. This afferent interaction change in turn, through the 
principle of stimulus generalization (X A), has two effects: first, 
the degree to which the stimulus trace remains unchanged produces 
a tendency to generalization of the ordinary sort based on its 
original nature; i.e., the reaction potential primarily based on this 
stimulus trace, through previous habit formation, will tend to be
[...] stimuli, with the amount of reduction depending
on the difference (d) between the two stimuli involved. This gives
rise to responses based on the absolute nature of stimuli. The second
[...] interaction change is the degree
[...] changed the stimulus; i.e., the degree and
[...] to the stim[...] preceding stimulus
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from the stimulus complex and the response latency noted and 
converted into reaction potential. This would be the S3 -> R1
magnitude except for possible interaction effects (XI). It should
also be appreciably less than that before the sound removal. This
value withdrawn (-) from the original total reaction
potential should yield the approximate value of s1Er1.

A useful checking approximation to the above values with the
same subjects could be secured by extinguishing the S3 -> R1 reaction
by massed trials. Substituting this n in an appropriate equation,
sEr = f(n), the equivalent s3Er1 value could be secured.
Then by restoring the sound and extinguishing again, substituting
the new n in the equation sEr = f(n), the s1Er1 value (except for
interaction effects) presumably would be secured.

THE DETERMINATION OF THE EXPONENT OF THE 
GENERALIZATION GRADIENT IN THE CASE OF 
STIMULUS INTENSITIES 

The splitting up of the causal factors of the stimulus generalization
gradient (sEr) into two components (sHr and V) in the case of
stimulus intensities raises the question of how the functions of the 
separate factors can be determined. A proposed procedure, which 
further illustrates the theory, is as follows. 

First, five or more groups of organisms would be taught a habit 
by the Hays procedure described above (p. 60 ff.), with the use of
critical stimuli which increase by equal log intensities (millilamberts)
starting at extremely weak values. The median response
latencies of each of these five learning curves would be converted
into sEr’s by equation 28, and learning equations fitted. The
coefficients of these equations would then be plotted as a function 
of log S and a separate equation fitted to them. This would be the 
equation of the particular stimulus-intensity dynamism involved. 

The equation for V, uncomplicated by stimulus generalization, 
would be used in connection with equation 20 in the interpretation 
of stimulus generalization (sEr) gradients both (1) toward increasing
and (2) toward decreasing stimulus intensities analogous to
those of Brown (2) and Hovland (P), the gradients being stated in
objective physical units such as millilamberts. Each sEr value
secured would be divided by the V value of the corresponding 
stimulus intensity. These quotients would then be plotted separately 
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indication of defect somewhere in the postulates involved. The false 
implications of the postulates are presented along with those agree- 
ing with fact because it is believed that the progress of behavior 
science at its present stage is best served by presenting the defects 
as well as the virtues of a system. 

A simple form of discrimination learning combined with trial- 
and-error learning is found in a situation permitting on a given 
occasion only one or the other of two responses, R1 and R2, say,
and presenting only one or the other of the stimuli S1 and S2. The
detailed analysis of this type of learning was not performed. No 
report of separate-stimuli presentation discriminatory trial-and- 
error learning has been found in the empirical literature. 

The detailed analysis of joint or simultaneous stimuli presenta- 
tion trial-and-error learning, also, was not performed. It is evident, 
however, that in this type of learning three new factors enter: (1)
the learning of a receptor-adjustment act, (2) the trace of the first
[stimulus] persisting until the second stimulus is received, and (3)
[...] processes. It is believed that the
[...] will be governed in part by the
[...] stimuli (27; 2S).
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from increasing and decreasing intensities as a function of the log 
stimulus intensity difference (d) producing each; these should be 
generalization gradients in terms of sHr multiplied by a constant
which would be D X K X J. The present analysis anticipates that 
the resulting equations would, except for sampling limitations, be 
identical, and that they would represent the true generalization 
gradient of habit strength uncomplicated by the circular consider- 
ations involved in the use of the j.n.d. units hitherto employed. 
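
The proposed division of each reaction-potential value by the V of the corresponding stimulus intensity may be illustrated by a brief Python sketch. It assumes that V takes the form V = 1 - 10^(-.44 log S) used in the stimulus-trace derivation; the function names and the sample numbers are hypothetical and are not empirical data.

```python
import math

def dynamism(intensity):
    """Stimulus-intensity dynamism V, assuming V = 1 - 10**(-.44 * log10(S)).

    The intensity must exceed 1 unit, since V is zero at S = 1.
    """
    return 1.0 - 10.0 ** (-0.44 * math.log10(intensity))

def habit_gradient(trained_intensity, observations):
    """Return (d, quotient) pairs for (intensity, sEr) observations.

    d is the log-intensity difference from the trained stimulus, and the
    quotient sEr / V is proportional to generalized habit strength.
    """
    rows = []
    for intensity, potential in observations:
        d = math.log10(intensity) - math.log10(trained_intensity)
        rows.append((d, potential / dynamism(intensity)))
    return rows

if __name__ == "__main__":
    # Hypothetical (intensity, sEr) pairs; not taken from Brown or Hovland.
    sample = [(10.0, 1.2), (100.0, 2.6), (1000.0, 2.1)]
    for d, quotient in habit_gradient(100.0, sample):
        print(f"d = {d:+.2f}  sEr/V = {quotient:.3f}")
```

If the analysis in the text is sound, the quotients plotted against d should fall on the same curve whether d is positive or negative.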
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The Derivation of the Stimulus-Trace Postulate (II) 

Reynolds (17) conditioned the blinking of the human eyelid to
various ages of the stimulus trace initiated by a single click. A 
puff of air was delivered at varying intervals following the click 
to four different groups of human subjects. All groups acquired the 
conditioned-reflex blink after 90 trials separated by periods varying 
from one to two minutes. This clearly means that the stimulus 
trace at various ages can be conditioned to a response with dis- 
tinctly different resulting reaction potential strengths. Reynolds’ 
results showed that his several delay groups gave characteristically 
different per cents of overt blinking responses to the conditioned 
stimulus. These varying susceptibilities to conditionability of stim- 
ulus traces of various ages are themselves believed to be due to 
differential stimulus trace intensities at those ages. On the ten 
trials from 81 to 90 the four groups gave the results shown in the 
second line of Table 8. 

At this point we proceed to a theoretical analysis of these and 
related data with a view to the preliminary formulation of a 

TABLE 8. The derivation of equivalent stimulus trace intensities (S') on Reynolds’
subsident stimulus trace gradient (17), on the basic assumption that at the maximum,
S' must equal approximately 1000.0 units of stimulus intensity.

Age of trace (t)                         .250"     .450"     1.150"    2.250"
Per cent overt responses                 68        98        70        31
Corresponding reaction potentials        3.0435    4.6295    3.1002    2.0799
Stimulus-intensity dynamism (V1)                   .952155   .637622   .427776
Calculated equivalent stimulus trace
  intensity (S')                                   1000.8    10.048    3.558

quantified postulate. The first step is the conversion of these per 
cent responses into σ values. Using the ordinary probability table
we secure the equivalent reaction potential values, shown in the
third line of Table 8, which presumably represent approximations
to corresponding sEr’s. These values appear graphically as part of
Figure 1 of Essentials of Behavior. Of the four sEr values we are
especially concerned with the final three, which represent approximations
to the reaction potentials occurring at .450", 1.150", and
2.250"; i.e., on the falling or subsident range of the stimulus trace.
Now, these reaction potentials are evidently set up under very
different trace values (V1) at the several delay times, though the
evocations are presumably in the same general temporal region. 
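
The conversion of per cent responses into reaction potentials can be approximated in Python as a rough sketch. The text says only that an ordinary probability table was used; the constant added below (the normal deviate for p = .995, about 2.5758) is an assumption inferred from the tabled values themselves, and the function name is hypothetical.

```python
from statistics import NormalDist

# Assumed offset: inferred from Table 8, not stated in the text.
OFFSET = NormalDist().inv_cdf(0.995)

def percent_to_reaction_potential(percent):
    """Convert a per cent of overt responses into a sigma-unit value."""
    z = NormalDist().inv_cdf(percent / 100.0)  # normal deviate of the proportion
    return z + OFFSET

if __name__ == "__main__":
    # Per cents reported for the four Reynolds delay groups.
    for pct in (68, 98, 70, 31):
        print(pct, round(percent_to_reaction_potential(pct), 4))
```

Under this assumption the computed values agree closely with the third line of Table 8.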
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Unfortunately there are only three pairs of such values. But in an
emergency we do the best we can with what is available. At the 
same time we change S to S' because there is no actual stimulus 
intensity here, and because the stimulus trace is in its subsident 
phase. The resulting fitted equation is: 

S' = 6.881 (t' + .01128)^{-[exponent illegible]}. (45)

There remains the question of the recruitment gradient characteristics
of S'. Kimble (12) has published some empirical results
in this range more or less comparable to those of Reynolds. Kimble’s 
results are given in the first, second, and third lines of Table 9. 

TABLE 9. The derivation of the equivalent recruitment stimulus trace intensities (S')
from Kimble’s stimulus trace gradient (12).

Age of trace (t)                      .100"    .200"    .225"    .250"    .300"    .400"
Per cent overt responses              45       51       54       77       87       95
Corresponding reaction potentials     2.4501   2.6009   2.6762   3.3146   3.7022   4.2207
Stimulus-intensity dynamism (V1)      .50392   .53493   .55042   .68172   .76144   .86808
Equivalent stimulus intensities (S')  4.92     5.70     6.15     13.49    25.98    99.83


From these values are calculated the equivalent V1 values and the
presumably corresponding S' values shown respectively in the
fourth and fifth lines of the table. Fitting an equation to the pairs
of values in the first and last rows of Table 9, we have:

S' = 9967.6 t^{[exponent illegible]} + 3.0. (46)

Freely adapting equations 46 and 45 to each other to eliminate 
inconsistencies presumably due to limitations in the size of the 
empirical samples and various artifacts (since they must be identical 
at their maximum point), we arrive at tentative equations 1 and 2: 

S' = 465,190 t^{7.6936} + 1.0; (1)

S' = 6.9310 (t' + .01)^{-1.0796}; (2)

which form the basis for Postulate II, parts A and B respectively 
(p. 5). It will be observed that these equations are so written 
that the maximum equivalent stimulus intensity (S') amounts to a 
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This means that the V 2 value for the responses conditioned in this 
region must be approximately constant at t = .45". 

Next we proceed to estimate from the sEr values, as they stand,
the corresponding V1’s of the learning process, which are carefully
to be distinguished from the evocation processes (V2’s). By equation

sEr = D × V2 × K × sHr × V1 (8')

We now shall assume that D × V2 × K × sHr = 4.862128.¹ It
[follows that if the sEr values of] Table 8 are divided by
4.862128 the quotients will be V1’s. Performing these divisions
[on the three] values Reynolds found in his subsident gradient, we
[secure the values shown in the fourth ro]w of Table 8.

[With these V1 values] available we may substitute them one at a
time in equation 6,

V1 = 1 - 10^{-.44 log S} (6)

Substituting the first V1 value of the fourth row, Table 8, we have,

.952155 = 1 - 10^{-.44 log S}

Solving for S,

.047845 = 10^{-.44 log S}
10^{.44 log S} = 1/.047845 = 20.9008
.44 log S = log 20.9008 = 1.32016
log S = 1.32016/.44 = 3.00036

[The equivalent S values for the] other two values
[were calculated by] means of equation 6 in a similar manner.

With a sufficient number of [S' values] available it should be possible
to fit an equation to [them as related] to the corresponding t' values where,

t' = t - .450".

¹ The 4.862128 was a value deliberately chosen to yield about 1000.0 units of stimulus
intensity at t = .45".
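
The hand computation above may be checked with a short Python sketch. It uses only the two quantities given in the text, the assumed constant 4.862128 and the coefficient .44 of equation 6; the function names are hypothetical labels introduced for illustration.

```python
import math

K_ASSUMED = 4.862128  # value the text assumes for D x V2 x K x sHr
COEFFICIENT = 0.44    # coefficient of log S in equation 6

def dynamism_from_potential(reaction_potential):
    """V1 obtained by dividing a reaction potential by the assumed constant."""
    return reaction_potential / K_ASSUMED

def equivalent_intensity(v1):
    """Solve V1 = 1 - 10**(-.44 * log10(S)) for S."""
    log_s = math.log10(1.0 / (1.0 - v1)) / COEFFICIENT
    return 10.0 ** log_s

if __name__ == "__main__":
    # Reaction potentials of the three subsident points in Table 8.
    for potential in (4.6295, 3.1002, 2.0799):
        v1 = dynamism_from_potential(potential)
        print(round(v1, 6), round(equivalent_intensity(v1), 3))
```

The three results fall close to the S' values in the last line of Table 8; small differences in the final digits are to be expected from rounding in the hand calculation.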
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conventional 1000 units. This means that real light units (milli- 
lamberts) would sometimes greatly exceed that amount. 

The necessarily indirect nature of the above determinations of 
equations 1 and 2 should give the reader a realistic comprehension 
of the two major aspects of the molar stimulus trace. 
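
The two phases of the molar stimulus trace may be sketched in Python as follows. The exponents 7.6936 and -1.0796 are assumptions here: they are the values that make each equation reach the conventional maximum of about 1000 units at t = .450", since the exponents are not clearly legible in the printed equations. The function names are hypothetical.

```python
T_MAX = 0.450  # age of the trace, in seconds, at which S' is maximal

def recruitment_intensity(t):
    """Equation 1 (assumed exponent): S' during the recruitment phase."""
    if not 0.0 <= t <= T_MAX:
        raise ValueError("recruitment phase covers 0 <= t <= .450 seconds")
    return 465190.0 * t ** 7.6936 + 1.0

def subsident_intensity(t_prime):
    """Equation 2 (assumed exponent): S' during the subsident phase, t' = t - .450."""
    if t_prime < 0.0:
        raise ValueError("t' must be non-negative")
    return 6.9310 * (t_prime + 0.01) ** -1.0796

def trace_intensity(t):
    """Equivalent stimulus intensity S' at any age t of the trace."""
    if t <= T_MAX:
        return recruitment_intensity(t)
    return subsident_intensity(t - T_MAX)

if __name__ == "__main__":
    for t in (0.10, 0.30, 0.45, 1.15, 2.25):
        print(t, round(trace_intensity(t), 2))
```

Both functions give roughly 1000 units at the junction of the two phases, which is the consistency condition the text imposes on equations 1 and 2.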

Reaction Potential (sEr) on the Stimulus Trace as a Generalization Continuum
The next step in our analysis will be to determine in a preliminary
manner the theoretical stimulus generalization characteristics of
reaction potential (sEr) throughout the molar stimulus trace.
With equations 1, 2, and 6, and those of Postulate X B available,
we are now able to calculate this throughout both phases, given the
point of reinforcement. As in the first empirical case considered, we
[assume that the point] of reinforcement occurs at the optimum
[...] of the conditioned stimulus.

We first [calculate the] equivalent stimulus intensity (S') of the
stimulus trace [by means] of equations 1 and 2. The values for the
subsident phase [...] are shown in the second line
of numbers in Table [10]. [Next we calculate the] generalized sHr throughout
both trace phases. A[ssuming that] habit strength has reached its
maximum (1.00) at [the point of] reinforcement (t = .45", i.e.,
t' = 0"), from which [it] generalizes, we find by means of Postulate
X B that,

sHr = s1Hr × 10^{-[coefficient illegible] d},

[where d is the difference between the] logarithm of the conditioned
stimulus (S1) and that [...]. [The]
values for the subsident phase of [the] trace are shown in the fourth
row of Table 10.

Finally, we substitute in Postulate X B, i.e.,

sEr = (sHr × V2)(D × K × V1)

Assuming that (D × K × V1) = [...] 4.862128, V1 in this case being
.95214, [and] sHr as shown in line [...], we have the
values of sEr as [...]. These values represent superthreshold
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Some Tentative Theorems Regarding the Stimulus Trace 

The examination of Postulate II and Figure 31, together with the 
considerations of the preceding section, gives rise to the following 
generalizations: 

THEOREM 18 A. The stimulus trace (s') reaches a maximum at
about .450".

B. Both the recruitment phase and the subsident phase of the stimulus
trace are power functions of time.

C. The subsident phase of the stimulus trace has a much longer 
duration than the recruitment phase. 

D. The recruitment phase of the stimulus trace must always alternate 
in occurrence with the subsident phase. 

Turning to the heavy graph of Figure 31, we observe that ac- 
cording to the present theory when reinforcement occurs at the 
point of maximum intensity on the stimulus trace (t = .45") the
following generalizations may be made: 

THEOREM 19 A. The recruitment phase of the generalized superthreshold
reaction potential (sEr), when reinforcement occurs about
.45" after stimulation, rises steeply in a slightly concave-upward
manner to the point of reinforcement, after which it falls sharply but
much more slowly in a decelerated manner toward a zero value.

B. The recruitment phase of sEr rises appreciably above the reaction
threshold at about .15 seconds after stimulation. 

C. The recruitment phase and the subsident phase of reaction po- 
tential pass through exactly the same sequence of reaction potentials on 
each stimulus occasion but in a reverse order and at a markedly different 
rate. 

Our next principle follows from 19 B and C, and from Postulate
X:

THEOREM 20 A. Reaction by generalized evocation must take
place between relatively similar stimulus intensities of the two phases
of the stimulus trace, particularly from the subsident phase (conditioning)
to the recruitment phase (subsequent evocation), and from a part
of one phase to a different part of the same phase.

But since the superthreshold reaction potential associated with a 
portion of the recruitment phase of such stimulus traces must 
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reaction potentials, i.e., they are based on actual responses. Gladstone
et al. (2) have shown with rats that the threshold stands .426σ
above the absolute zero of reaction potential (Z). Values corresponding
to row 5 were calculated for the recruitment phase of the
stimulus trace (t = 0" to .45") by strictly analogous procedures. 
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It will be noticed that the reaction potential reinforced at 
t' = 1.5" on the subsident trace, when generalized from 1.5" to 0" 
is much greater than when generalized from 1.5" to 6.0". Generalizing
from this consideration, we arrive at part C of Theorem 21:

C. Generalized reaction potential for comparable time deviations from 
the point of reinforcement toward the maximum stimulus intensity on 
the subsident phase of a stimulus trace is definitely greater than when 
generalization from the same point of reinforcement occurs on the weaker 
extreme of the same stimulus trace. 

But sometimes the manipulandum as well as much of the external 
stimulus pattern does not become available to the subject until a 
point has been reached on the stimulus trace later than the point 
of reinforcement. From this, together with a consideration of both 
graphs in Figure 31, we have: 

D. On the subsident trace, the farther down below any given point
of reinforcement the response is evoked, the weaker the response will be.

But if the phases of the stimulus trace posterior to the point of 
reinforcement (s,^) are able to evoke reactions, as is implied by 
Theorem 20 C, does this mean that the reaction will be repeated 
continuously as long as the stimulus trace retains any appreciable 
strength? In considering this question it should be recalled that (1) 
when a stimulus trace evokes a reaction, proprioceptive and other 
stimuli impinging on the various receptors as the result of the 
reaction may be expected by the principle of afferent interaction 
(XI) to change somewhat the nature of the stimulus trace, thereby 
reducing that habit’s reaction evocation powers; (2) the Ir produced
by the act itself will tend for a short time to inhibit the act;
and (3) in the case of successful flight reactions the evocation of the 
act and the consequent primary reinforcement of its connection 
with the trace of the proprioceptive stimuli arising from a preceding 
flight evocation will rarely or never occur. 

Generalizing from these considerations, we arrive at our twenty- 
second theorem: 

THEOREM 22. If one phase of a perseverative stimulus trace (s') 
evokes a reaction (R), a subsequent phase of that individual trace
will not be likely to do so unless the conditions of the situation are 
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always precede any superthreshold portion of the subsident sec- 
tion, it follows that: 

B. There will be a strong tendency for connections set up
during both phases of the stimulus trace to be evoked by an earlier
portion of a subsequently occurring recruitment phase.

When a response is evoked earlier in the stimulus sequence than
the point corresponding to the circumstances of its original reinforcement,
we have a case of antedating reaction. Now when a response
an[tedates the] conditions of its reinforcement, the subject is said
[to ...] reinforcement. This has
[...] follows from Theorem 20 B [...]
[...] objective principle underlying [...]
[...] been formally [...].
Therefore, the notion of “expectancy” [...]
(20) rather than as a primary [...] as is sometimes assumed.*

The essence of [...]
[...] chanced to have a low value at the occurrence of a stimulus
conditioned to a [...]
response [...] conditions of original reinforcement
and presumably would occasionally do so. Therefore it
follows that:

[C. ... (probability) for responses conditioned
to the recruitment ...]

Turning now to the more lightly plotted
graph in Figure 31, jointly with [...], we
arrive at the following generalization:

THEOREM 21 A. When [...] is set up on the
[...] portion of a stimulus [trace ...] maximum trace intensity,
the reaction potential when [...]
(other things equal) is weak at the point reinforced [...]
[...] reinforcement and evocation [...]
[...] stimulus trace that reinforcement
occurs (other things equal) [...] subsequent response
evocation is likely to antedate the conditions of its reinforcement.

* See Chapter 5, especially [...] terminal note.
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dramatic in its action than the avoidance of tissue injury, applies 
both to approach or adient reactions (e.g., seizure of prey) and to 
flight or abient reactions. 

But where a continuing stimulus operates there must be a com- 
petition among the responses conditioned to it. It follows from 
Theorem 20 B that in this competition the responses evoked by the 
recruitment phase of the trace will tend to eliminate by a “short- 
circuit” responses which were originally in the behavior sequence, 
because there will not remain in the series any proprioception 
adequate to evoke the chain in question. Therefore we have part B 
of Theorem 23: 

B. Organisms capable of having their reactions reinforced to stimulus 
traces will manifest the phenomenon of “short-circuiting” behavior
sequences (5, pp. 520, 522). 

The theory of the short-circuiting of reactions will be taken up 
in considerable detail in a subsequent chapter (see pp. 278 ff.).
But before we leave this subject even temporarily we must point
out that the stimulus trace is by no means the only mechanism 
which mediates adaptively antedating reactions. One of the more 
obvious of the additional mechanisms is found in the persisting of 
external stimuli, such as the stimuli arising from the apparatus in 
conditioning situations. This mechanism brings about antedating 
reactions in a manner even more obvious than that of the persevera- 
tive stimulus trace (7, p. 74). Still a third important mechanism 
which mediates antedating reactions is the persisting internal stim- 
ulus associated with a continuing, though diminishing, need (Sd); 
the practical outcome in this case is substantially the same as that 
of the continuing external stimulus. 

The Dilemma of the Conditioned Defense Reaction

To superficial view the conditioned defense reaction leads to a kind 
of biological paradox (4, p. 509). In the case of a response leading
to food consumption, the act will always be followed by reinforce- 
ment and so the reaction potential will be kept up to full strength. 
In this respect the conditioned defense reaction differs radically. 
As pointed out above, for a conditioned defense reaction to be 
wholly successful the movement must occur so early in the sequence 
that the organism will completely escape injury. 
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such as directly to condition the repetition to that phase along with the 
proprioceptive stimulus traces left by a preceding reaction of the same 
kind. 

The Adaptive Significance of Antedating Reactions Mediated by the 
Perseverative Stimulus Trace 

The occurrence of events in the external environment to which the 
organism must react in order to survive is lawful. Thus if event B 
follows event A under certain circumstances on one occasion, it 
will do so on subsequent occasions of the same kind. Now, if event 
A activates some of an organism’s receptors as it occurs, and event 
[B] through its stimuli evokes a reaction which reduces a need created
by event [B], this reaction will be conditioned to the perseverative
stimulus traces left in the organism by event A. After the conditioning
[has raised the] reaction potential (sEr) to a value greater
[than ...], as we have seen above, reaction
R will be [...] original occurrence in the event-sequence
in[volving the] organism and its environment. In case event
B is the [...] injury or other need of
that type and [...] defense reaction taking the form
of flight or withdrawal [from the] neighborhood, it is clear that if
the withdrawal reaction [occurs] before the occurrence of the
injurious situation [...]
injury and the antedating [...] have been an effective
defense reaction. In [...] organisms, where
flight is frequently [...], [the ...] time for such flight
made possible by the antedating conditioned trace reaction must
obviously be of imme[nse ...]
adaptive mechanism [...]. From these considerations
we arrive at [part A] of our twenty-third theorem:

THEOREM 23 A. Organisms capable of having their reactions
conditioned to [...] ([...] Theorem VIII).

On the side of [...] produced by food
[...], it is generally true that [...] the shorter the duration of a need,
the better is the organism’s [chance of] survival. Thus any mechanism
which hastens the atta[inment ...], [other] things equal,
favors survival. This principle of adaptive dynamics, while less
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in some detail, particularly as related to reactions conditioned to 
perseverative stimulus traces.

Let it be supposed that a reaction (R) has been conditioned to 
the maximum intensity of 4σ to a perseverative stimulus trace (s')
which has an age of 2"; that all reactions occurring earlier than 2"
are unreinforced (7, p. 258); and that all reactions occurring later
than 2" will be reinforced. The unreinforced reactions will generate
extinction effects where t < 2" (7, pp. 277 ff.), which will generalize
upon the excitatory tendencies produced by the reinforcements
where t > 2" (7, p. 264). It is evident from these considerations
that we have here essentially a case of separate-presentation simple
discrimination learning on the dimension of the perseverative
stimulus trace. 

Generalizing on the above considerations, we arrive at parts A 
and B of our twenty-fifth theorem: 

THEOREM 25 A. If the antedating reactions evoked during the
setting up of trace conditioned reactions consistently are not followed
by reinforcement even at the usual point of reinforcement, this phase of
the perseverative stimulus trace will gradually cease evoking reactions,
with the result that later phases of the trace will be free to evoke the 
reaction even though the response will be somewhat weakened; this will 
therefore be a true delayed trace reaction. 

B. That phase of the perseverative stimulus trace which is subjected 
to extinction during the differential reinforcement involved in the setting 
up of a delayed conditioned trace reaction will acquire conditioned inhibitory
characteristics (7, p. 281), resulting in the so-called “inhibition
of delay”; and reactions normally evoked by other stimuli will
suffer interference if these latter stimuli act during this period of delay. 

If a delayed conditioned trace reaction has been established, a 
certain amount of inhibition will be set up at that phase of the 
stimulus trace at which the non-reinforced reactions have occurred;
the rise of this inhibition will be gradual, and so the amount of
delay in reaction will increase gradually. This gives rise to part C
of our theorem:

C. Where a delay in the evocation of conditioned trace reactions is
being set up, the amount of the delay will increase gradually. 
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But in case the organism succeeded in escaping the injury there 
would be no cessation of pain to serve as a reinforcing agent, and 
repeated reaction without reinforcement would generate experi- 
mental extinction (IX D). The consequent cessation or retardation 
of the reaction due to this weakening would cause a subsequent
recurrence of the injury. This in turn would initiate a second
cycle substantially like the first, which would be followed by others,
a series of successful escapes always alternating with a series of
injuries. As an adaptive mechanism such an arrangement clearly
would not represent a very high degree of efficiency.

[...] biological dilemma sketched above is only a half
truth. [...] organisms under the assumed conditions
may tend [...] of this type. Ordinarily, however,
these [...] recurring indefinitely. The stimuli
produced by [...] power of secondary reinforcement
([...]). [...] the power of secondary
reinforcement [...]
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V, p. 101). ” “ff h“uome Junctionally autonomous 
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to the tempo of events in the* synchronizing reactions 

habits involved will receive a way that the 

Vc have already seen that amount of reinforcement, 

reaction. There are situations requires very prompt 

tion will be unadaptive i e when a too prompt reac- 

rcaction if it is to be reinforcU demand a delay in the 

e roust now consider this problem 
and by North in the present connection (15, p. 443), as the reversals 
continue the two short-trace habits and their opposing extinctions all 
tend to become of maximal strength according to the present 
theory. But as the short-trace habits and their extinctions approach 
their maximum strengths the net sER's of the respective turns will 
approach zero and the number of trials required to yield a given 
advantage to either short-trace habit after reversal must increase 
indefinitely. We are accordingly forced to dismiss the role of 
ordinary short-trace stimuli as involved in a possible explanation of 
reversal learning. 

As stated above, it is not true that all stimulus trace combinations 
are reversed each day. A little thought on the reader’s part will 
show him that in this experiment there are two combinations of 
long stimulus traces, one or the other of which if followed by a 
right-hand choice, say, always will be reinforced, and an analogous 
pair of relations always holding for the left-hand choice. These four 
specific cases are all shown in some detail in Table 11. Under the 
present assumptions cases I or II occur on days 1, 3, 5, etc., and 
cases III or IV occur on days 2, 4, 6, etc.; cases I and III involve 
the perseverational stimuli resulting from reinforcement, and cases 
II and IV involve the perseverational stimuli resulting from 
reinforcement failures or frustrations. 

TABLE 11. The combinations (1) of long stimulus trace conditions and (2) of responses 
which are always reinforced under the reversal assumptions stated in the text. The 
cases in Roman numerals at the left apply only to the trace combinations in the 
middle columns. 

Case   One-minute stimulus-trace combination      Always reinforced if 
                                                  followed by the act of: 

I      Right-hand turn + food (reinforcement)     a right-hand turn 
II     Left-hand turn + no food (frustration)     a right-hand turn 
III    Left-hand turn + food (reinforcement)      a left-hand turn 
IV     Right-hand turn + no food (frustration)    a left-hand turn 
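
The four cases of Table 11 reduce to a single rule, which may be sketched in a few lines of Python. The function name, the string labels for the turns, and the boolean flag for food are illustrative assumptions introduced here; they are not part of the original account. The sketch merely restates the table: a turn that was followed by food is the one reinforced on the next trial, and a turn that was followed by frustration points to the opposite turn.

```python
# Illustrative restatement of Table 11 (names and encoding are assumptions).
OPPOSITE = {"right": "left", "left": "right"}


def always_reinforced_turn(previous_turn: str, previous_food: bool) -> str:
    """Return the turn that is always reinforced after the given
    one-minute trace combination (previous turn plus its outcome)."""
    if previous_turn not in OPPOSITE:
        raise ValueError("previous_turn must be 'right' or 'left'")
    # Food after a turn: repeat it. Frustration after a turn: switch.
    return previous_turn if previous_food else OPPOSITE[previous_turn]


# The four trace combinations listed in Table 11.
CASES = {
    "I": ("right", True),
    "II": ("left", False),
    "III": ("left", True),
    "IV": ("right", False),
}

for label, (turn, food) in CASES.items():
    print(label, "->", always_reinforced_turn(turn, food))
```

On this reading the subject need not relearn a new bond each day; it need only discriminate the four trace combinations from one another.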

Next, the reader will clearly understand that two additional 
principles must operate if this learning reversal is to succeed. The 
first principle is that the patterning of these stimulus traces depends 
on afferent interaction (XI). For example, the one-minute trace 
of a right turn in the maze coupled with the receipt of food (case I) 
is very different from the one-minute trace of a right turn coupled 
In case the trials are massed, a good deal of inhibition that 
dissipates spontaneously will be involved in the delay. From this 
follows part D of Theorem 25: 

D. If an appreciable pause follows the setting up of a delay on a 
conditioned perseverative stimulus trace by massed trials, a test subsequent 
to the pause will show the delay to have partially or wholly disappeared. 

‘he present system imply that if the primary 
^PP^cia y increased this will tend to over-ride moderate 
known'tn'h' analogous effects are 

cSel <^=>ffeine citrate. From these 

considerations we arrive at our last two parts of Theorem 25: 

^iUbettlTr " trace has been achieved, it 

p. 127)”^ masked by an increase in the primary drive 

I Usse^^dV"^ trace has been achieved, it will 

t^h as caffeine and 

" 

known experimentafw'^for'^°'"***'* learning which has been 

theory of whkh J;u‘h uL hut concerning the 

albino rats are MnS 'tthts. Let it be supposed that 

hand alley for ten trials at food-reward in the right- 

then food reward in the left ‘"’"vals on the first day, 

intervals on the second dav ‘"“h at one-minute 

bay for twenty or more daJs"*M ™ training reversed each 

subjects. Several cxpcrimenis r tthoices are permitted the 

ported (J4; 75). They revea! type have been re- 

erroneous choices in each blneV decreasing number of 

animal will perfect the dlserim” ‘"“hi after reversal. Sometimes an 
a ter reversal, that on the first iriaT'^ ®h°'«ng only one error 
y^ learning be explained? ' ‘h’^ degree of reversal 

One reason why reversal !« ■ 

sight it appears that all the bonrir""® PfazHng is that at first 
next, and so on indcfinitelv day are reversed the 

plained above in detail in a o ‘ true. Moreover, as ex- 
qaite different connection (p. 67), 
with no food (case IV). The second principle operating here is 
that of discrimination learning (Chapter 3). For example, the 
subject must be able to distinguish the implication of a short 
trace of a right turn from that of a long (1') trace of a right turn; 
and to distinguish a long trace of a turn to the right coupled with 
food reinforcement from a long trace of a turn to the right coupled 
with the frustration of no food. These rather subtle differences are 
difficult for non-speaking animals to respond to successfully, and 
it is no wonder that rats and even apes learn the reversals slowly. 

Generalizing on the preceding considerations, we arrive at our 
twenty-sixth theorem: 


THEOREM 26. Organisms possessing normal afferent interaction and 
perseverative trace mechanisms and the usual mammalian discrimination 
powers will learn to reverse learned responses with a progressive 
saving of errors, with the initial error at a series of trials as the lower 
limit (cases II and IV). 


The Experimental Validity of the Preceding Theorems 

Turning now to the question of the general empirical validity of 
the above theorems, we find that neurophysiology has not produced 
any relevant direct evidence regarding Theorem 18. However, 
there is considerable empirical data available regarding most of the 
other theorems. 

The work of Wolfle (22), Reynolds (17), and Kimble (12) supports 
the general view that learning is optimal when reinforcement 
follows the conditioned stimulus by a short half second, and that 
the gradients at each side proceed by a concave-upward curvature. 
This fact tends to substantiate Theorem 19 A. 

The approximate soundness of Theorem 19 B is shown by the 
minimum latency value of human reaction to visual stimuli if a 
little is added to the latency to provide for the time consumed by the 
physiological response itself. The truth of Theorem 19 C appears 
to be almost self-evident, even though the relationship seems never 
before to have been pointed out. 

The strong tendency of learned responses to antedate the conditions 
under which reinforcement occurred has long been known. 
Rodnick (18) reports what purported to be a galvanic skin 
conditioned-reflex experiment; in this the reaction was conditioned 
primarily to the trace of a tactual-vibratory stimulus lasting only 
.18", which was followed after an interval of 17.4" by a reinforcing 
shock also lasting .18". The modal latency of this rather sluggish 
response early in the training was 3.0", though the distribution 
skewed off to the long latency side even up to 17.0". The typical 
distribution of these latencies shows experimentally that almost all 
of the responses antedated the conditions of reinforcement. This 
may be seen in Figure 32. Thus Theorem 20, parts A and B, appears 
to find empirical verification. 
FIGURE 32. The distribution of the first third of the response latencies from five 
subjects based on what purports to be a trace conditioned galvanic skin reflex conditioned 
to the trace 17.4 seconds after its initiation. Note the antedating nature of the reactions 
and their distinctly skewed distribution. Adapted from Rodnick (18, p. 423). 

In this connection we may note that at one time it was customary 
for psychologists working on the phenomena of rote learning (13, 
pp. 70 ff.) to speak of antedating reactions as remote forward associa- 
tions, and of perseverative reactions as remote backward associations. 
This was before the Pavlovian concepts of stimulus traces and 
stimulus generalization on stimulus-trace continua as here used 
were introduced into the theory of rote and compound trial-and- 
error learning, which reduced both types of “remote” associations 
substantially to the generalization of simultaneous association. 
FIGURE 33. Mean frequency of errors in the original learning and on each of 
numerous reversals on a T-maze. Reproduced by permission of A. J. North (15). 


In a somewhat similar manner Theorem 21 D is roughly 
substantiated by Czehura's results, which showed a falling 
concave-upward shape (7). His "traces," however, sometimes seemed to last 
for about a half-minute, and this suggests that some internal motor 
response may have been persisting there instead of the trace as 
ordinarily conceived. This may also explain why the logarithmic 
curvature of Czehura’s gradients does not agree with the present 
theory. 

The fact that ordinarily defense reactions are not repeated in a 
clonic manner unless the type of reinforcement which has been 
received involves the proprioception of previous responses tends to 
substantiate Theorem 22. 
The confirmation of Theorem 23, parts A and B, is found in 
everyday observation, which demonstrates its importance (5). 
Theorem 24 definitely needs a precise experiment to determine its 
validity in detail. 

Theorem 25 A is a well authenticated experimental phenomenon 
reported by Pavlov (16, p. 93). Theorem 25 B is somewhat 
equivocally substantiated by an experiment reported by Rodnick (19). 
Theorem 25 C appears to be fully substantiated by an experiment 
reported by Rodnick (18), both for delayed and for trace 
conditioned reflexes. Theorem 25 E is well substantiated by Pavlov (16, 
p. 127). Theorem 25 F was empirically validated, regarding the 
action of caffeine, by Switzer (21). 

As already indicated, the fact that chimpanzees (14) and albino 
rats (15) can learn, though with difficulty, to discriminate repeated 
training reversals presents empirical verification of Theorem 26. 
This evidence is represented graphically in Figure 33. It is note- 
worthy, however, that the learning does not become perceptible 
until the fourth or fifth reversal. 

Summary 

Despite the fact that the molar stimulus trace is a theoretical con- 
struct, a fairly clear but tentative quantitative postulate regarding 
its nature has been derived from a few relevant learning data. 
From this postulate and in association mainly with the principles 
of stimulus-intensity dynamism and stimulus generalization there 
have been derived theoretically a number of important behavioral 
principles, most of which have long been known. Even so, without 
their theoretical derivation they could hardly be considered as 
fully understood. 

Perhaps the most significant principle peculiarly dependent upon 
the stimulus trace is the fact that newly learned stimulus-response 
connections under ordinary circumstances almost invariably evoke 
the response in advance of the conditions of the reinforcement. 
This is a valuable biological device in defense reactions and in the 
seizure of prey. In addition, the movements which originally 
occurred in the interval thus shortened are dropped out of the 
behavior sequence, i.e., they are short-circuited. This is adaptive 
in that it reduces useless energy consumption. 

On the other hand, if reinforcement is given by massed repetitions 
and continued for a considerable length of time with an appreciable 
and constant delay, inhibition (IR and sIR) will develop which will 
produce a slowing of the reaction latency. Rodnick found that this 
retardation of the responses in trace galvanic reactions was very 
much slower in being brought about than in an experiment where 
the stimulus itself was continuous, i.e., in the delayed conditioned 
Apparently the answer lies largely in the role played by the 
perseverative stimulus traces as presented in the above chapter. In 
the case of group a the response is conditioned to various 
after-effects of reinforcements, both immediate and remote. In the case 
of group b the response is conditioned to the mixed after-effects of 
both reinforcements and non-reinforcements, and there will be a 
tendency for the reinforcements actually to set up connections 
between the responses and 30-second traces of non-reinforcements 
themselves. This will naturally oppose the extinction effects of the 
non-reinforcements interposed among the genuine reinforcements. 

Now, consider the extinction process proper: In group a it 
presumably will result in part from the loss of the trace effects of the 
regular reinforcements which are now replaced by the traces of 
non-reinforcements. In group b, however, the presence of 
non-reinforcement traces during extinction will not create a very radical 
change (loss) from the original conditioning; indeed the traces of 
non-reinforcement, being so closely associated with genuine 
reinforcements, probably have a mild secondary reinforcing power. 
On two counts, then, group b with its 50 per cent reinforcement 
should resist experimental extinction better than group a with its 
100 per cent reinforcement. 

On the other hand, if the conditioning of groups a and b is by 
trials separated by intervals long enough to permit the traces of 
the non-reinforcements to dissipate before the next following 
reinforcement takes place, the latter will not connect the 
non-reinforcement trace with the response. As a result we should not 
expect group b to resist extinction better than group a. 

It may be added that since the differential reinforcement utilized 
in discrimination experiments may involve 50 per cent (or less) 
reinforcement exclusively, it follows that if this process is to be 
effective, the intervals between the trials should be long enough to 
permit the after-effects of the non-reinforcements largely to 
dissipate. 
5. Fractional Antedating Goal Reactions 


We have shown that stimulus traces presumably serve as generalizing 
agents, and as such should strongly tend to evoke reactions 
antedating the conditions of the original action occurrence 
(Theorem 20, p. 107). We must now note two further related but rather 
different mechanisms which perform a somewhat similar function 
of generalization continua, namely, (1) the drive stimulus, SD 
(5, pp. 519 ff.), and (2) the sequence of external stimuli, S1, S2, S3, 
and so on. 

Preliminary Considerations Regarding the Antedating Goal Reaction 

Consider an organism which is presented with a sequence or chain 
of external stimuli, S1, S2, S3, S4, and SG, and which makes a 
sequence of responses, R1, R2, R3, R4, and RG, where SG is the food 
stimulus and RG is the consummatory response, e.g., that of eating. 
The organism is assumed to be hungry, so that SD will accompany 
the RG, or eating response (6, pp. 495 ff.). Now, according to 
Postulate III, a goal response such as eating is assumed to initiate 
and accompany the process of diminishing SD which produces 
reinforcement. 

The preceding considerations show that RG may be reinforced 
to the persisting SD (6, p. 487) and to the rather differently 
persisting traces of S1, S2, S3, S4, and SG. It follows that on a repetition 
of this sequence there will be a tendency for SD, together with the 
traces of S1, S2, S3, and so on, to evoke RG at the outset of the 
sequence and more or less continuously throughout it except in so 
far as there may be a conflict between RG and the necessary 
instrumental movements of the sequence, such as R1, R2, R3, and R4. 
Presumably in any such situation the instrumental acts would 
dominate the conflicting portion of the antedating generalized act, 
permitting the non-conflicting or fractional portion to persist in a 
covert form. We shall represent this non-conflicting part of RG by 
rG. For this reason such assumed persistence (rG) is called the 
fractional antedating goal reaction. 

There are several points to observe in regard to this fractional 
antedating goal-reaction concept. The first thing is that its origin 
requires rG to be intimately associated with the attainment of goals, 
i.e., with reinforcements. But the association with reinforcements is 
known to set up the power of secondary reinforcements (ii). 

Generalizing on the above considerations, we arrive at Corollary 
XV (already given in Chapter 1): 

Corollary xv 

When a stimulus (S) or a stimulus trace (s) acts at the same time that 
a hitherto unrelated response (R) occurs and this coincidence is 
accompanied by an antedating goal reaction (rG), the secondary reinforcing 
powers of the stimulus evoked by the latter (sG) will reinforce S to R, 
giving rise to a new S -> R dynamic connection. 

The second point to observe in regard to rG is that such a response 
presumably produces a continuous stimulus which is characteristic 
of the consumption of the goal substance (K) throughout the 
behavior series of which it is a part. We shall call this the fractional 
goal stimulus (sG). A third point is that the different drive stimuli 
(SD) differentiate the various needs and guide the behavior to their 
realization to some extent, but the goal stimuli (sG), because of 
their infinite variety, constitute a wealth of stimuli leading to such 
guidance. A fourth important characteristic of such fractional 
antedating goal reactions is that in the behavior sequence here 
assumed, when the fractional goal reaction is evoked at S4 it is 
called forth by generalization on four traces (from S1, S2, S3, and 
S4); at S3 it is evoked by generalization on three traces (from S1, S2, 
and S3); at S2 it is evoked by generalization on two traces (S1 and 
S2); and at S1 it is evoked by generalization on only one trace (S1). 
Reaction Potential as a Function (J) of the Delay in Receiving the Incentive 

When stimuli such as S1, S2, S3, and S4 are consistently followed 
by a reinforcing state of affairs (RG), their traces will acquire the 
capacity to evoke rG with its secondary reinforcing powers. This 
implies that S1 will acquire the power to evoke R1, i.e., S1 -> R1, 
S2 -> R2, S3 -> R3, S4 -> R4, and SG -> RG. But, as pointed out 
above (p. 125), the rG decreases in strength from S4 to S1, in that 
order, which presumably means that its reinforcing power decreases 
from S4 to S1. It follows that SG -> RG, S4 -> R4, S3 -> R3, S2 -> R2, 
and S1 -> R1 will decrease in strength from the point of 
reinforcement backward to the beginning of the sequence. Now, the 
generalization on a subsident trace toward its maximum intensity is 
relatively horizontal on the whole, but with a sharp reduction as 
the maximum trace is approached (Figure 31, Chapter 4). But this 
progressive decrease of the relatively horizontal generalization 
effect in question will produce a gradient which slopes 
downward in intensity from SG to S1 with a marked concave-upward 
curvature. This is the substance of the principle of the gradient of 
reinforcement within a given reaction chain (type A) (S, p. 135). 

Generalizing from the preceding considerations we arrive at part 
A of Corollary iii (already given in Chapter 1): 


Corollary iii 

A. The greater the delay in reinforcement of a link within a given 
behavior chain, the weaker will be the resulting reaction potential of 
the link in question to the stimulus traces present at the time. 


Corollary iii A shows that the delay in reinforcement within a 
continuous sequence of reactions tends to generate a chain of 
responses, which manifests a progressively diminishing reaction 
potential from the point of reinforcement back to the link farthest 
removed from the reception of the incentive. We must now note 
that there is a case roughly parallel to this in which (1) but a single 
S -> R connection is involved, (2) the delay in primary reinforcement 
varies from one experimental situation to another, and (3) 


* The symbol J was originally used to represent a postulate (S, p. 178), but it is now 
regarded as derivable from other postulates of the set. However, even though derived, 
the symbol is retained to represent the same general function of the effects of delay in 
receiving the incentive (K'). 
the organism does nothing in particular through chain reactions or 
otherwise during the delay. It is this non-chaining case of the gradient 
of reinforcement which we shall now consider in some detail. 

As pointed out in the first part of this chapter, both Sd and the 
stimulus trace presumably become attached to rG. We shall accordingly 
take up the stimulus-trace factor alone in this section. If 
later Sd is proved experimentally to have this sort of power to an 
appreciable extent, the theoretical consideration of the two may 
then be combined (+). 

Since the detailed quantitative theory of the delay in reinforce- 
ment of single responses is both very new and very complex, we 
proceed to its elaboration with considerable uncertainty. The im- 
portance of opening the problem to the consideration of behavior 
scientists, however, is great enough to warrant the effort even at 
the risk of making some errors. Incidentally it is encouraging to 
observe that a few writers (19; 20; 16) have already suggested the 
general idea which we shall advance. Spence first proposed the 
basic notion that the fractional antedating goal reaction would 
generalize on the continuum of the stimulus trace. From this the 
present theory has been developed. First we shall sketch the theory 
in a preliminary qualitative manner and then proceed to a detailed 
quantitative derivation. 

Let it be supposed that a hungry albino rat is presented at hourly 
intervals with a response-bar in a Skinner-box situation, and that 
each time the animal makes a bar-pressing response (R) to the 
smell of food on the bar a delay of 4" elapses before it receives the 
food reward (K'). Simultaneously with the animal's reception of 
the food the goal reaction of eating (rG) occurs and becomes 
primarily reinforced to the stimulus trace left by the apparatus and 
response stimuli. This trace appears after a brief recruitment phase 
of a short half second, and then proceeds abruptly into a relatively 
protracted subsident phase during which it decays to zero. In 4" 
it will diminish to a comparatively small value. Our working 
hypothesis is that the strength of the associated rG is a positive 
function of the magnitude of the stimulus trace to which it is 
attached. This is one factor in weakening the response (R) when 
the reward is delayed. Each time the apparatus is presented to the 
subject after its reinforcement to rG the trace will tend to evoke the 
rG throughout its earlier or stronger sections by stimulus-trace 
generalization, i.e., in close temporal proximity to R. This 
antedating evocation of rG naturally produces an antedating sG. Now 
since sG is intimately associated with primary reinforcement, e.g., 
eating, it will acquire the power of secondary reinforcement (xvi). 
But the greater the delay in reinforcement, the weaker will be the 
generalized rG, the weaker the sG, and consequently the weaker the 
secondary reinforcement between S, the apparatus stimulus, and R, 
the response. Presumably the generalization is based on sHR and 
the response, rG, is evoked by the junction of sHR and the D which 
is always present. This is a rough qualitative outline of the theory 
of the present gradient of reinforcement of a response as a function 
of its delay. 

Having before us the qualitative substance of the single-link 
gradient of reinforcement, we may now proceed to a more detailed 
development of the theory, this time in a numerical quantitative 
manner. Perhaps the best way to do this is to present at once the 
basic assumed equation, 

J = {}_{S}E_{R_d} = D \times K \times V_2 \times {}_{S}H_R \times 10^{-jd} \times V_1, (9) 

where 

D = 5.17142 
K = .98745 
V_2 = .95214 
{}_{S}H_R = .97 
d = 2.81034 
V_1 = .17482. 

Now according to the present hypothesis two fairly distinct reaction 
potentials are involved here: (1) that of the goal reaction 
and (2) that of the instrumental or bar-pressing reaction (sER); 
D, K, and H are common to both reactions and function in both 
jointly to produce the J result. As an explanatory detail it may be 
stated that V1 is supposed to operate only on the goal reaction 
potential generalizing on the stimulus trace, and the V2 is 
supposed to operate only in the production of the final or 
instrumental act (sER) of pressing the bar, though this makes no 
difference in the outcome, since equation 9 is multiplicative. 

The detailed theoretical computations of the entire gradient are 
shown in Table 12. In order that the reader may understand this 
table we shall now trace in some detail the computational process 
where the delay in reinforcement is assumed to have been 4" as 
already suggested. 

One of the most complicated computations concerns the 
generalization of the goal reaction potential along the stimulus trace. 
In the first place there are 
involved two stimulus traces: (1) that of the apparatus stimuli and 
(2) that of the proprioception of the response (R). The bar-pressing 
response must always follow the presentation of the apparatus 
stimuli by varying amounts of time, since R occurs more or less by 
chance. This means that the two stimulus traces will differ in 
strength by varying amounts on different occasions. Since we know 
very little quantitatively about stimulus-trace generalization we 
shall consider only one trace, or a kind of composite of the two. 
Both traces presumably reach their maximum strength a short 
half second after their respective physical stimulations. It follows 
that at 4" after R occurs this trace is only about 3.5" from its 
maximum, and the trace of the visual apparatus stimuli will have 
varying amounts of strength more than 3.5" from its maximum. 
We shall accordingly assume that the maximum strength of the two 
traces will fall on the average at about the time R occurs, or 4" 
from the time of the receipt of the incentive (K'). 

The first time the food reinforcement occurs the apparatus trace
will have fallen from 1000.0 units (Table 12, line 2, column 1) to
1.5476 units (Table 12, line 2, column 9). This means that the
reaction potential thus set up will be between the goal reaction
(r_G) of eating and the greatly enfeebled stimulus-trace element. On
the next trial (one hour later) we may expect a new event to occur,
namely, the generalization of the habit function involved. According
to equation 21 this may be expressed as,


s̄H̄r = sHr × 10^(-jd), (21)

where d = log S2 - log S1. By Table 12, line 3, log S2 = 3.00000
and log S1 = .18966. Accordingly


d = 3.00000 - .18966 = 2.81034.

Substituting in the function 10^(-jd), we have

10^(-.15 × 2.81034) = .37884,

which appears in line 5, column 9.
At this point we introduce V1, which was computed from S' (at
4" delay, column 9) by means of equation 6. This yields .17482 as
shown in row 6, column 9. It is assumed that the goal reaction
potential is dependent on the stimulus-intensity dynamism of the
trace to which it is reinforced. This accounts for the presence of
V1 in equation 9.*

Finally, substituting in equation 9 the computed results shown
in Table 12, we have

J = sEr_d = 5.17142 × .98745 × .95214 × .97 × .37884 × .17482
  = .31235, (47)

which is the theoretical amount of the reaction potential under
conditions of delay. This is the value we have been seeking. In
Table 12 it appears in row 7, column 9. The remaining values in
this table were calculated in an exactly analogous manner.

FIGURE 34. Graphical representation of a theoretical gradient as a function (J) of
the delay in reinforcement. Plotted from the final line of values in Table 12.

We secure a representation of the theoretical gradient of
reinforcement as a function of the several delays, given in the first
values of the ten columns in Table 12, by plotting the final values
of the same columns. This is shown as Figure 34. It is particularly
to be observed that each of these final delay values represents a
determination with a distinct group of comparable subjects. This
must be sharply distinguished from the situation considered in
Corollary iii A in that the latter involves the dynamics of behavior
chains (Chapter 6) in which each subject on each trial undergoes
all the delays of the experiment; in the case involved here the
subject supposedly does little or nothing but wait for food during a
specific period of delay.

* In case it is decided that V1 does not belong in equation 9, the generalization of
sHr alone (row 5) will produce a delay-of-reinforcement gradient which agrees very
well with empirical fact even though it may not be quite the same shape as Figure 34.
One reason for assuming that sHr_G has a characteristic such as sEr_G is that it presumably
comes from the act r_G, which itself involves its own reaction potential.
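
The worked 4" column traced above can be checked with a short
sketch. It is not the author's procedure verbatim. The form of
equation 6 used here, V = 1 - 10^(-.44 log S), and the constant
j = .15 are assumptions: they are chosen because they reproduce the
tabled values .95214, .17482 and .37884 to about four decimal
places. The function and variable names are illustrative only.

```python
# Sketch only: the equation forms and the constant j are inferred from the
# worked figures in the text, not quoted from Table 12 itself.

def dynamism(log_s, exponent=0.44):
    """Stimulus-intensity dynamism V for a trace of log strength log_s.

    Assumed form of equation 6: V = 1 - 10**(-0.44 * log S).
    """
    return 1.0 - 10.0 ** (-exponent * log_s)


def generalization_factor(log_s2, log_s1, j=0.15):
    """Factor 10**(-j*d), where d is the difference of the two log strengths.

    j = 0.15 is an assumption inferred from the worked value .37884.
    """
    d = log_s2 - log_s1
    return 10.0 ** (-j * d)


# Values quoted in the text for the 4-second delay (column 9).
D = 5.17142       # drive
K = 0.98745       # incentive
S_H_R = 0.97      # habit strength
LOG_S2 = 3.00000  # log of the trace at its maximum (1000.0 units)
LOG_S1 = 0.18966  # log of the trace after the delay (1.5476 units)

v2 = dynamism(LOG_S2)
v1 = dynamism(LOG_S1)
g = generalization_factor(LOG_S2, LOG_S1)

# Equation 9 is a simple product of the six factors.
j_value = D * K * v2 * S_H_R * g * v1

print(f"V2 = {v2:.5f}, V1 = {v1:.5f}, 10^(-jd) = {g:.5f}, J = {j_value:.5f}")
```

Because equation 9 is a product, any one factor near zero (here V1
and the generalization factor) pulls the whole reaction potential
down, which is what gives the gradient its steep early fall.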

These and related considerations bring us to our next theorem, 
which, because of the great amount of use which it will serve, we 
shall call a corollary. (This also was given in Chapter 1.) 


Corollary iii

B. The greater the delay in the receipt of the incentive by groups of
learning subjects, the weaker will be the resulting learned reaction
potentials (sEr_d), the shape of the gradient as a function of the
respective delays being roughly that of decay with the lower limit of
the extended gradient passing beneath the reaction threshold, i.e.,


J = sEr_d = D × K × V2 × sHr × 10^(-jd) × V1 (9)

where

d = log S' of V2 - log S' of V1.


It happens that a great deal of excellent experimental work is
available bearing on the validity of Corollaries iii A and iii B.
The information regarding iii A comes from behavior chains
(Chapter 6). This evidence suggests strongly that the chain
gradients of delay in reinforcement presumably fall less as a
maximum and are of much gentler slope than are those generated by
[...] accordingly has ample empirical [...] Perkins (IS), and
Grice (3), among others, show the limiting effect on the final
reaction potential of the delay in reinforcement and the general
[...] that Corollary iii B also has received substantial empirical
verification [...] in detail. Most of [...] Figure 34; this is
particularly noticeable at the [...] shown in [...] of Perin's
curves, however, has a shape much like Figure 34 (8, p. 139).

Recent experimental studies, especially several emanating from 
the University of Iowa laboratory, show that as greater care is 
taken to avoid secondary reinforcement from the apparatus situa- 
tion the fall in the gradient becomes progressively more rapid; i.e., 
the length of the gradient is related to the amount of reinforcement 
which is present. For example, Wolfe’s gradient (1934) was reported 
in terms of minutes, whereas the curves reported by Perin and by 
Spence’s students (1946-1950) were in terms of a few seconds only 
as is the gradient shown in Figure 34. In this respect, also, the 
deduction is, accordingly, supported by recent experimentation. 

As may easily be seen, there are in the foregoing situation (iii B) 
numerous postulated factors and many uncertainties yet to be 
clarified jointly by experiment and by systematic interpretation 
of theory; from which we conclude that these deductions must be 
considered no more than first approximations. Since the amount
of sEr (according to equation 9 as worked out in Table 12) involves
the product of six different values, the detailed relationship is left
uncertain. The author's previous writings on this subject suggest
that the asymptote of each learning curve is limited by the amount
of delay, and follows a course approximately according to that
shown in Figure 34. On the other hand, certain investigators (e.g.,
Seward, 76) seem to attribute the delay in reinforcement to
motivation. Problems of this type are exceedingly difficult to solve
experimentally because sHr presumably can never be observed
behaviorally apart from sEr, D, K, and the rest. Actually this may
mean that it is impossible to determine just how each of the five
values represented in equation 9 is related to the gradient of
reaction potential as a function of the delay in reinforcement,
because basically the question may turn out to be a false one.

The Realization of an Anticipation and Its Frustration
When r_G → s_G leads to S_G → R_G, i.e., when the anticipation of food
leads to the actual eating of the food, we have what we shall call
the realization of an anticipation. It is evident that under these
circumstances the strength of the connection between the trace and
r_G, and that between s_G and the preliminary movements leading to
R_G, will be further increased.
But what will happen if, after the habit is established, the
anticipation is not realized; if the anticipated food is not found as
previously? Clearly, the absence of S_G, the major stimulus
component, will eventually terminate the goal behavior formerly
evoked by S_G, and the acts involved in an effort to realize the
anticipation will generate I_R, which will tend to inhibit them.
This termination is what is known as experimental extinction. But
before the extinction may be considered complete, generalization
from somewhat similar situations which have been reinforced
after a little delay will produce a considerably stronger series of
the subgoal reaction with wider variations, such as those involved
in searching behavior. These more aggressive reactions will be
accompanied by internal (emotional) secretions, but in the end
they will receive no reinforcement and the inhibitions conditioned
to them will accumulate in amounts greater than the reaction
potential; the extinction may then be said to be complete. Since
the process of conditioning the inhibition to s_G is one of learning,
the resulting quantitative outcome of experimental extinction is
shown substantially as an ordinary learning curve, sometimes with a
little initial rise in sEr due to emotion (D).

Generalizing on the preceding considerations, we arrive at our 
next major conclusion; 


Corollary xvii

On the abrupt cessation of the customary reinforcement of a previously
learned act, the repeated presentation of the evoking stimuli will for a
time (1) continue to evoke the act, (2) sometimes with at first a slight
[...] potential, which (4) is the reverse of a positive learning curve.

[...] extinction has been known [...] an animal is systematically
and uniformly rewarded [...] in which [...] the number of unrewarded
acts is gradually increased, it may be taught to do a great deal of
work for quite a small reward. In this case the stimulus of repeated
failure leads to an anticipated reward (r_G) which counteracts to a
large extent the influence of the accumulating I_R. The point is that
unless a very great deal of work (W) is involved the sIr appears not
completely to counteract the effect of the r_G which accompanies each
non-reinforced response, because the traces of these responses are
always reinforced in a major way at the end of the sequence.
Apparently the reason experimental extinction takes place so easily
when non-reinforcement first begins to occur consistently is that r_G
has not been given a chance to become attached to the traces of
stimuli produced by non-reinforcement. This is of course a sound
biological economy.

FIGURE 35. Graphic record of a rat trained to work on the basis of various ratios of
unreinforced to reinforced acts. Plotted from Skinner's published data. Reproduced
from Keller and Schoenfeld (9, p. 95).

Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 27. In case an animal is taught an act through a given
reinforcement and then is given a gradually increasing series of
non-reinforced massed response evocations always followed at once by
primary reinforcement, the r_G will become attached to the traces of
these non-reinforcements in such a way as to reinforce them, largely
neutralizing the I_R accumulating and thus permitting very long
primarily unreinforced behavior series to occur.

Experimental work bearing on the validity of Theorem 27 has
been reported, especially by Skinner (IS). He states that a single rat
was gradually shifted from learning to give a series of eight
unreinforced responses before receiving one reinforcement up to a
maximum ratio of 192 unreinforced responses to one reinforcement! A
graphic record of this animal's behavior is shown in Figure 35.
Theorem 27 appears to have found a convincing empirical verification.

It should be observed that the traces from the various responses
become reinforced to R_G at the end of a fixed number of acts. As
pointed out above (Corollary iii A), this gradient of generalization
slopes downward from the reinforcement to the beginning as a
concave-upward slope.

From the preceding considerations, we arrive at our next 
theorem; 

THEOREM 28. When an organism has become accustomed to making 
a number of unreinforced responses to one reinforcement in a fixed 
ratio continuously, the responses will start out relatively slowly and 
will increase in rapidity with a positive acceleration, reaching a maxi- 
mum at the point of reinforcement. 

It happens that the Skinner data plotted as Figure 35 also 
illustrates this theorem very nicely. Each of the three records pre- 
sented consists of a series of these gradient-of-reinforcement curves 
between the successive horizontal lines. Thus Theorem 28 appears 
to find empirical verification. 


Two Double-Drive Learning Situations 

There is much reason to believe that the internal drive stimulus
in hunger differs from that in thirst. For example, in the
Hull-Leeper experiments (7; 72) albino rats learned after varying
amounts of training to turn one way (R_R) around a maze
rectangle when hungry (S_Dh) but not thirsty, and to turn the
opposite way (R_L) around the rectangle when thirsty (S_Dt) but not
hungry. The first situation may be represented fairly adequately
for this purpose by,


S_Dh → R_R,

and the second by,

S_Dt → R_L.

The point is that each stimulus is associated with the evocation 
of a different turning response, and since only one of the stimuli 
was present at a time there should be a tendency for the associated 
response only to occur on any given test occasion. No doubt the 
antedating goal reaction enters into the situation but this exposi- 
tory complexity need not be considered at this time. (See p. 139.) 

The above symbolic representation of the presumptive partial 
dynamics of the situation serves fairly well to explain the phenom- 
ena resulting from this particular experimental arrangement, but 
it would be an easy matter to set up an experiment which could 
not be explained in this simple manner. For example, let it be
assumed that an albino rat when both hungry and thirsty is placed
in a simple T-maze. If the animal turns right (R_R) to that arm of
the maze, it finds food and eats; if it turns left (R_L) into that arm
of the maze, it finds water and drinks. By analogy of the preceding
analysis, S_Dh and S_Dt will both be attached to R_R and R_L. So far as
this mechanism alone goes, when the animal later enters the maze
hungry but not thirsty the S_Dh alone will tend to evoke both R_R
and R_L. And similarly, when the animal is thirsty but not hungry
the S_Dt alone will tend to evoke both R_R and R_L. The result will be:

S_Dh → R_R and R_L;  S_Dt → R_R and R_L,

i.e., on this assumption alone both stimuli will evoke both responses.
In other words, the preceding type of analysis furnishes no presump- 
tion that the hungry organism will tend to choose the right arm of 
the T more frequently than the left arm, or that the thirsty organ- 
ism will tend to choose the left arm more frequently than the 
right arm. 

However, Spence, Bergmann, and Lippitt (20) have made a
rather different theoretical analysis of this experimental situation,
which seems to avoid the above difficulties. They suggest that the
fractional antedating goal reaction enters the situation, changing
its dynamics quite radically:

1. The organism, both hungry and thirsty, turns into the right
arm of the T, finds food there, and eats it.

2. As a result of this the stimulus traces of these events, notably
of looking right at choice point x of the T (s_xR) and beyond, will be
reinforced to the eating or goal responses (r_Gh), i.e.,

s_xR → r_Gh.

3. On subsequent occasions the fractional antedating goal
response (r_Gh) will be drawn forward to s_xR and earlier.

4. Because of this the combination of the trace of s_xR (i.e.,
looking to the right) and s_Gh (eating) will always be reinforced to
the right-turning response (R_R), i.e.,

{s_xR, s_Gh} → R_R.

5. Similarly the trace of s_xL (looking to the left at choice point x)
[...] reinforced to the left-turning response [...]

[...] organism, S_Dh and S_Dt have [...] has been
accustomed to eat when hungry and drink when thirsty.

[...] of this experiment, S_Dh → r_Gh [...] experiment S_Dh and
S_Dt have been [...] the effects of the previous [...]

10. Therefore when on the test at any given [...]

S_Dh → r_Gh > S_Dh → r_Gt,

i.e., under these test conditions,

s_Gh > s_Gt.

11. It follows from 4, 10, and Postulate VI that

{s_xR, S_Dh, s_Gh} → R_R > {s_xL, S_Dh, s_Gt} → R_L,

i.e., under these conditions R_R > R_L.

12. Similarly, it follows from 5 and similar reasoning that,

{s_xL, S_Dt, s_Gt} → R_L > {s_xR, S_Dt, s_Gh} → R_R,

i.e., under these conditions R_L > R_R.

13. From 11 and 12 it follows that the organism, after such
training and when subject to only one need (within the limits of the
oscillation function), will tend to go directly to the arm of the maze
where it has been accustomed to receive temporary release from
the cravings it has at the time and so receive reinforcement.

Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 29. If an organism has two drives operating and is
repeatedly in a situation where one or the other drive stimulus may
be reinforced by a distinct series of movements, it will later, when
only one of the drives is operating, at once tend to perform the
movements which formerly led to the reduction of the S_D in question.

There is a single experiment bearing on the theoretical deduction
just given, that of Kendler (10). As a matter of fact, the present
author used this experiment as a kind of target at which the theory 
was aimed. Actually the theoretical conclusion agrees very well 
with Kendler’s empirical findings. He reports that on the test under 
each of the single drives his animals turned in the direction of the 
appropriate reinforcing agent a statistically reliable number of 
times. 

And now, after the preceding discussion of the second
double-drive problem, we may return to the first (Hull-Leeper) problem
with our new analysis and a more adequate, though not a simpler,
statement of the theoretical outcome. On the analogy of the
inequality shown in step #11 of the preceding deduction, we can now
state for the first double-drive problem:

{S_Dh, s_Gh} → R_R > {S_Dh, s_Gt} → R_L,

i.e., in respect to sEr under these circumstances,

R_R > R_L.

And on the analogy of the inequality #12 of the preceding deduction
we may state:

{S_Dt, s_Gt} → R_L > {S_Dt, s_Gh} → R_R,

i.e., in respect to sEr under these conditions,

R_L > R_R.

The fact that both Hull's and Leeper's rats learned to
discriminate hunger from thirst stimuli amply substantiates Theorem 29.
Also it is likely that the more rapid learning by Leeper's technique
was due to the reinforcing effect of r_G → s_G which presumably was
evoked in a mild form by the animals' seeing the food when not
hungry and seeing the water when not thirsty. This leads us to the
consideration of latent learning.

Latent Learning in Theoretical Perspective

[...] understanding [...] phenomena concerned with latent
learning. But [...] be well to [...] general question of manifest
and latent [...] potential. Let us take for expository purposes a
simpler form of equation 8 (Postulate X),

sEr = D × K × sHr. (48)

[...] D = 0, K = .9 and sHr = .6126. Substituting these values,
we have:

sEr = 0 × .9 × .6126
    = 0.

The point is that the considerable learning represented by sHr
= .6126 would not be evident in the behavior of the organism as
sEr, i.e., this sHr would be latent. On the other hand, if D = 3.105
we should have:

sEr = 3.105 × .9 × .6126
    = 1.7119,

which is an appreciable reaction potential. Now the influence of
the sHr has become manifest as contrasted with its previous latency.

Or suppose we have the D and sHr values as just assumed but
K = 0, i.e.,

sEr = 3.105 × 0 × .6126
    = 0.

Once again, but for a quite different reason, the sHr becomes latent,
though when the K = .9 is restored it becomes manifest as in the
computation presented just above.
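
The three cases just computed can be set side by side in a few
lines. This is only an illustrative sketch of equation 48; the
function name and the case labels are not from the text.

```python
def reaction_potential(drive, incentive, habit):
    """Simplified equation 48: sEr = D x K x sHr."""
    return drive * incentive * habit


HABIT = 0.6126  # sHr, the same in all three cases

# (label, D, K) for the three situations discussed in the text.
cases = [
    ("no drive (latent)", 0.0, 0.9),
    ("drive and incentive (manifest)", 3.105, 0.9),
    ("no incentive (latent)", 3.105, 0.0),
]

for label, drive, incentive in cases:
    potential = reaction_potential(drive, incentive, HABIT)
    print(f"{label}: sEr = {potential:.4f}")
```

The habit strength is identical throughout; only the multiplier
changes. A zero in either D or K is enough to hide the learning,
which is the sense in which sHr is said to be latent.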

The early experimental studies of latent learning tended to con- 
sider the modifications in learning curves caused by an abrupt 
shift from a relatively slight reward, what we would now call a 
secondary reinforcement, to a fairly large reward. As just shown 
it is an easy matter theoretically to represent these shifts of both 
the drive (D) and the incentive (K) functions. Because of their 
more central significance in the present historical problem, we 
shall here consider only shifts from smaller to larger incentives, 
rewards, or reinforcing (K') agents and the reverse. This is a com- 
plex process and must be elaborated. 

In connection with the shift in the quantity of the incentive
during or at the completion of the learning we note that there are
at least three processes involved. The first is implicit in Postulate
VII. In this process the permanent response intensity varies with
the magnitude of the incentive according to equation 7. The second
process may be called the Crespi effect, because it was Crespi who
first clearly demonstrated its existence and general nature (2).
Through this effect, when the incentive changes from that
operative during the particular learning, the corresponding response
intensity (or latency) not only shifts upward or downward as
implied in equation 7, but both sorts of response shifts are in
excess of the permanent response intensities called for by equation 7.
The third effect associated with shifts in incentive is that the
permanent response change itself is a rather rapid asymptotic
learning process requiring about four trials for its approximate
completion. These points are shown with admirable clearness by
Crespi's graphs which are reproduced as Figure 36. Unfortunately
there are two matters which Crespi's experiment does not clarify:
the questions of (1) whether the excess shift effect is temporary or
[...] supplementary [...] that the [...] temporary.

We assume that this learning, both positive and negative, is at the
rate of 80 per cent on [...] maze learning is here assumed to be
[...] each trial. This means that with continued [...] the excess
shift effect will soon reverse itself [...] presumably [...]
normal effect of equation 7. All of this [...] subject to
empirical [...]
Because of the foregoing considerations and the fact that Crespi's
effect seems not to appear in maze learning, we shall now ignore
this matter completely and proceed to the deduction of the
permanent transition of the incentive upward at N = 10. The value of
the lower incentive (K) is taken as .6, whereas that of the upper is
taken as .9, the difference being .3. Multiplying this difference by
.8 per trial, the new incentive factor on increase, we have .24.
This added to the .6 previously used amounts to .6 + .24 = .84.
In addition, the value of sHr has shifted to .6862 at N = 11. It
will be recalled that here the value of D is 3.105. Substituting all
these values in our modified sEr equation 48, we have after the
tenth trial and before the eleventh trial,

sEr = 3.105 × .84 × .6862 = 1.789.

Similarly, the next sHr = .7176 and the new K obtained by an
analogous procedure amounts to .888. Substituting in equation 48
again we have, after the eleventh trial and before the twelfth trial,

sEr = 3.105 × .888 × .7176 = 1.977,

and after the twelfth trial but before the thirteenth,

sEr = 3.105 × .8976 × .7458 = 2.077.

This value of 2.077 deviates only .0004σ from the value which
would have resulted had an incentive value of .9 been used from
the beginning, and is far too small a difference to be detected by
empirical methods now available. Exactly analogous computations
were made for a second shift of the incentive upward from .6 to .9
at N = 20, and with exactly analogous results. The theoretical
results of these shifts are represented graphically in Figure 37.
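
The per-trial rule used above (close 80 per cent of the remaining
gap between the old and the new incentive on each trial) can be
sketched as follows. The function name and the default rate
argument are illustrative. The sHr values are those quoted in the
text; the products agree with the printed 1.789, 1.977 and 2.077 to
within about .002, the small differences presumably being rounding
in the original hand computation.

```python
def incentive_after_shift(k_old, k_new, trials, rate=0.8):
    """Return the effective K after each of `trials` post-shift trials.

    On every trial K moves `rate` (here 80 per cent) of the remaining
    distance toward the new incentive value k_new.
    """
    values = []
    k = k_old
    for _ in range(trials):
        k = k + rate * (k_new - k)
        values.append(k)
    return values


D = 3.105
HABITS = [0.6862, 0.7176, 0.7458]  # sHr for the three trials after the shift

k_values = incentive_after_shift(0.6, 0.9, trials=len(HABITS))
potentials = [D * k * h for k, h in zip(k_values, HABITS)]

for k, h, e in zip(k_values, HABITS, potentials):
    print(f"K = {k:.4f}, sHr = {h:.4f}, sEr = {e:.3f}")
```

After three trials K has reached .8976 of a possible .9, so the
curve is already nearly indistinguishable from one run with the
larger incentive throughout.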

The procedure in the case of the downward shifting of incentive,
while exactly analogous, differs in some of its details because the
learning in this case consists in a progressive decrease in the K value
from .9 to .6. The difference in K as before is .9 - .6 = .3, and
.3 × .8 = .24. This is subtracted from .9, i.e., .9 - .24 = .66. The
value of sHr at N = 16 is .8147, while D is the same as before.
Substituting in equation 48, we have:

sEr = 3.105 × .66 × .8147 = 1.6696.

Similarly, where N = 17,

sEr = 3.105 × .612 × .8332 = 1.5834;

where N = 18,

sEr = 3.105 × .6024 × .8499 = 1.5897;

and where N = 19,

sEr = 3.105 × .60048 × .8649 = 1.6126.
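
The downward series can also be written in closed form, which makes
the dip and subsequent rise in sEr easy to see. The sketch below
assumes the same 80 per cent rule, so that n trials after the shift
the gap remaining is (.2)^n of the original .3. The function name
and the dictionary of sHr values (copied from the text) are
illustrative.

```python
def incentive_closed_form(k_old, k_new, n, rate=0.8):
    """Effective K on the n-th trial after an abrupt incentive shift.

    Equivalent to applying the 80 per cent rule n times: the remaining
    gap shrinks by a factor (1 - rate) on each trial.
    """
    return k_new + (k_old - k_new) * (1.0 - rate) ** n


D = 3.105
# sHr at each trial N, as quoted in the text.
habit_by_trial = {16: 0.8147, 17: 0.8332, 18: 0.8499, 19: 0.8649}

potentials = {}
for n, (trial, habit) in enumerate(sorted(habit_by_trial.items()), start=1):
    k = incentive_closed_form(0.9, 0.6, n)
    potentials[trial] = D * k * habit
    print(f"N = {trial}: K = {k:.5f}, sEr = {potentials[trial]:.4f}")

# Trial at which the reaction potential is lowest.
lowest = min(potentials, key=potentials.get)
print("lowest sEr at N =", lowest)
```

The minimum falls at N = 17: after that K is nearly at its floor of
.6, while sHr keeps growing with practice, so the product turns
upward again.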

In this series of computations, despite the fact that the value of
the K component progressively decreases, the sEr values at N = 18
and 19 progressively increase because of the increase in N and so
[...] resulting from the shift; this value is, of course, far smaller
than could be secured by ordinary experimental procedures or
graphical representation. The preceding computations are shown
graphically in Figure 37.

Generalizing on these considerations, we arrive at our next
theorem:

THEOREM 30. Other things constant, an abrupt shift in the incentive
used during a maze-learning process will be followed first by a major
shift in reaction potential and then by two or more progressively smaller
shifts on successive trials, the series constituting a rapid learning
process of the exponential variety, culminating in the course that the
sEr would have followed had the new incentive been operating
continuously from the beginning of the learning.

FIGURE 38. Graphic representation of the empirical effect of a shift in the incentive
upward in terms of blind-alley entrances made on the maze shown in Figure 67. Here
the increased incentive was added at the arrow. The curves HR and HNR are from
control experiments, here inserted for purposes of comparison. Reproduced from
Tolman and Honzik (23, p. 267).

We are fortunate in having several excellent empirical
investigations bearing on the soundness of the present corollary. In a
classical study by Tolman and Honzik (23), two groups of 41 rats each
were trained on the maze shown in Figure 67. One group received no
reward in the food-box during the first ten days; they were retained
there for two minutes and then returned to their living cages where
after three hours they were given their regular feeding. On the
eleventh day and thereafter, this group received food reward in the
end box. The behavior of these animals is shown in the HNR-R
curve of Figure 38, together with curves from two control groups.
With the addition of increased incentive the experimental group
at once began to reduce its error scores, reaching the permanent
level in about three days. A few years earlier Blodgett (7) had
reported an experiment similar to this, in which he secured
analogous results.


FIGURE 39. Graphic representation of the empirical effect of a shift in the incentive
downward in terms of the number of blind-alley entrances made in the maze shown in
Figure 67. The weaker incentive was substituted at the arrow. The curves of HR and
HNR correspond to the two empirical curves shown in Figure 38, and are inserted in
this figure as controls for purposes of comparison. Reproduced from Tolman and
Honzik (23, p. 262).


The animals of Tolman and Honzik’s other group were trained 
with reward in the food-box during the first ten days, and there- 
after with no reward; instead they were retained in the food-box 
for two minutes and then returned to the living cage where, after 
three hours, they were given their regular feeding. The mean 
behavior of this second group is shown in the HR-NR curve of 
Figure 39; the same control group curves as those appearing in 
Figure 38 are included here also for purposes of comparison.
Beginning with the withdrawal of the stronger incentive, a marked 
rise may be seen in the number of errors; about two days were 
required for this to reach the upper level. Moreover, in the original 
study as reported there are two parallel latency graphs both of 
which show at least three separate subordinate shifts tending, upon 
the whole, to decrease progressively in magnitude. 

Of course the above facts have been well known for a consider- 
able time, so that Theorem 30 is not a prediction but rather an 
explanation and a formulation. At all events, the deduction agrees 
substantially with empirical facts. Incidentally these facts were 
originally put forward by Tolman (22) as the major evidence for 
his concept of latent learning. Because the Tolman-Honzik data 
do not clearly show the Crespi incentive-shift effects (2, p. 508),
we have made no attempt to incorporate these phenomena into the 
present formulation. 

The reason why latent learning has attracted so much attention 
is not that it is of any obvious practical importance, at least at the 
present time, but because of its theoretical significance. This will 
be evident from a quotation from Tolman (22, pp. 363-364): 

But, as we saw at the beginning of the chapter, the Law of 
Effect also does not hold. The latent learning experiments 
indicate very definitely that just as much learning . . . goes 
on without differential effects, or, at the most, with only very minor
degrees of such effects, as with strongly differential ones. . . . 
Differential effects are, that is, necessary for selective perform- 
ance but are not necessary, or at the most in only a very minor 
degree, for the mere learning . . . which underlies such per- 
formance. [Italics added.] 

The italicized portions of this quotation show that at the time 
it was written little distinction was made between a zero incentive 
or effect and a minor degree of incentive or effect. On the other 
hand, recent studies have shown that secondary reinforcement 
seems to have fairly strong reinforcing effects. For example, second- 
ary reinforcement probably was the factor which caused the fall 
in the hunger-no-reward (HNR) curve shown in Figure 38. As a 
matter of fact its fall was almost half as great as that of the hunger- 
reward (HR) shown in the same figure. Such a large and con- 
sistent effect is not accidental and clearly must be reckoned with. 




The behavior of these animals is shown in the HNR-R curve of
Figure 38, together with curves from two control groups. With the
addition of increased incentive the group at once began to reduce
its error scores, reaching the lower level in about three days. A
few years earlier Blodgett (7) had reported an experiment similar to
this, in which he secured analogous results.



FIGURE 39. Graphic representation of the empirical effect of a shift in the incentive
downward in terms of the number of blind-alley entrances made in the maze shown in
Figure 67. The weaker incentive was substituted at the arrow. The curves of HR and
HNR correspond to the two empirical curves shown in Figure 38, and are inserted in
this figure as controls for purposes of comparison. Reproduced from Tolman and
Honzik (23, p. 262).


2. In an exactly similar manner the left turn to the water arm
of the maze would yield a slight tendency to

$$\{S_L,\ S_i,\ s_{G_w}\} \rightarrow R_L.$$

3. Now as pointed out above, organisms usually eat when hungry
and drink when thirsty. Therefore,

$$S_{D_h} \rightarrow r_{G_f} \;>\; S_{D_h} \rightarrow r_{G_w}.$$

4. Under hunger conditions $S_{D_h}$ (#3),

$$s_{G_f} > s_{G_w}.$$

5. By logic similar to #3 and #4 it might be shown that under
thirst conditions ($S_{D_t}$),

$$s_{G_w} > s_{G_f}.$$

6. Therefore steps #1, #3, and #4 would yield the following
theoretical inequality:

$$\{S_R,\ S_{D_h},\ s_{G_f}\} \rightarrow R_R \;>\; \{S_L,\ S_{D_h},\ s_{G_f}\} \rightarrow R_L.$$

7. And from steps #2 and #5 we have the theoretical inequality:

$$\{S_L,\ S_{D_t},\ s_{G_w}\} \rightarrow R_L \;>\; \{S_R,\ S_{D_t},\ s_{G_w}\} \rightarrow R_R.$$

From the preceding considerations we arrive at our next theorem: 

THEOREM 31. If an organism operates on a T-maze when satiated
with both food and water, consistently finding food at the end of one
arm of the T and water at the end of the other arm of the T, this
training will so reinforce the responses of turning into the respective
arms of the maze to the visual and related traces of looking into those
arms that later when under the food drive or water drive only the
organism will have a slight tendency to choose the appropriate arm of
the T.

During the last ten years much careful experimental work per- 
formed exclusively with albino rats has been devoted to the matter 
Moreover there is strong reason to believe that a great difference
exists between a very small K, i.e., a little reinforcement, and no
reinforcement at all. By the computations shown below it appears
that if a habit amounts to .6126 and an incentive (K) amounts to
.01, this would bring out the reaction potential at

$${}_{S}E_{R} = 3.109 \times .01 \times .6126 = .0190.$$

But when the K rises to .9 the ${}_{S}E_{R}$ will rise in a few trials
to 1.7119, as we saw above (p. 141).
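
As a check on the arithmetic, the product can be evaluated directly. The sketch below assumes only what the printed line shows, namely that the reaction potential in this illustration is the product of the multiplier 3.109, the incentive K, and the habit strength .6126. The function and variable names are illustrative and do not come from the text.

```python
def reaction_potential(multiplier: float, incentive: float, habit: float) -> float:
    """Product form of the worked example: multiplier x K x habit strength."""
    return multiplier * incentive * habit


MULTIPLIER = 3.109  # constant factor as printed in the worked example
HABIT = 0.6126      # habit strength as printed

# Very small incentive versus a large one, habit strength held fixed.
low_k = reaction_potential(MULTIPLIER, 0.01, HABIT)
high_k = reaction_potential(MULTIPLIER, 0.9, HABIT)

print(f"K = .01 gives {low_k:.4f}")
print(f"K = .9  gives {high_k:.4f}")
print(f"ratio = {high_k / low_k:.0f}")
```

With the habit strength held at .6126, the first product comes to about .019 and the second to about 1.714, roughly ninety times as large, which is the contrast the argument turns on. The 1.7119 cited from p. 141 was obtained there under its own conditions and is not reproduced exactly by this simplified product.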

Current Aspects of the Latent Learning Problem 

In the thirty years or so which have passed since the original 
experiment on latent learning was performed a great deal of 
scientific work has taken place in this difficult field. The matters 
of secondary (static) reinforcement from the apparatus situation 
and the various drive motivations have been as carefully controlled 
as possible, so that in case presumably latent learning occurs the 
variable factor producing it may be identified. During this time 
those who are developing the reinforcement-theoretical approach
have become interested in the possibility of deriving latent learning 
by way of the fractional antedating goal reaction. We propose to 
examine this possibility now. 

Let it be supposed that an organism satiated with both food and 
water but motivated by a mild third incentive, such as a cage mate, 
is repeatedly run through a T-maze with food at the end of the 
right arm of the maze and water at the end of the left arm, in 
addition to a cage mate in each place. After an equal number of 
these mildly socially rewarded trials on each arm of the T, what 
should we expect to have theoretically, in the case of the right arm 
of the T?

1. We have seen reason to believe that even when the subject is
fully fed the sight of food will mildly evoke $r_G \rightarrow s_G$,
secondarily reinforcing (xv) a little the trace (looking toward the
right) to $R_R$, even though eating does not occur;
of latent learning. Among the important studies may be mentioned
those of Spence and Lippitt (27), Spence, Bergmann, and Lippitt
(20), Walker (24), and Kendler (77). While these investigators
disagree among themselves to some extent, the impression one
receives from studying their reports is that there probably is a very
slight positive tendency to latent learning. The weakness of this
effect is rather surprising in view of the fact that we humans, in
performing one task, ordinarily observe things unrelated to it which
we later recall and utilize when some other task presents the need 
for the information. No doubt this anthropomorphic analogy has 
strongly fostered this belief regarding rats. The apparent fact that 
the tendency is so strong in humans and so weak in the rat presum- 
ably means that some subvocal speech mechanism not possessed 
by the rat is primarily responsible for the difference. 

Summary

There is reason to believe that both goal and subgoal reactions 
become reinforced to stimulus traces and other persisting stimuli. 
These stimuli generalize throughout their period of persistence. 
This generalization has been especially obvious in connection with 
the stimulus trace and goal reactions. These give rise to the concept 
of fractional antedating goal reactions ($r_G$) and the consequent
proprioceptive stimuli, $s_G$. Now the goal reaction, in whatever form,
is believed to be mildly reinforcing. It thus comes about that
through the mediation of $r_G$ secondary reinforcement would
logically occur, a matter of fact long known empirically.

These antedating reactions apparently can be both positive and 
negative. Consequently in a molar behavioral sense they become 
foresight, or what the philosophers have called cognition, though not
necessarily with the speech accompaniment operative in humans.
This negative expectancy, coupled with $I_R$, yields both experimental
extinction and the possibility of learning to perform very long
series of unreinforced acts which are consistently reinforced at the
ends. 

The analysis of delays in reinforcement series shows that $r_G$
presumably becomes reinforced to the stimulus trace at the end
of the series and generalizes to the series beginning where it
reinforces the $S \rightarrow R$ connections there present, yielding a
gradient of delay in reinforcement (iii B). Something like this
gradient, but probably not the same thing (iii A), arises within a
single behavior
chain, though this is complicated by the type of chain involved. 

The antedating goal reaction appears to explain the Kendler- 
Spence double-drive problem, and to give a supplementary expla- 
nation of the Hull-Leeper double-drive problem. It also gives some 
promise of clarifying the long-standing controversy regarding latent 
learning and of throwing light indirectly on the still longer-standing 
uncertainty regarding the molar aspect of reinforcement itself. 

Terminal Notes 

FORESIGHT, FOREKNOWLEDGE, EXPECTANCY AND PURPOSE 

Suppose that a hungry organism proceeds through a maze with 
food at the end. The various responses, including especially the goal 
or eating response ($R_G$) at the end will occur as indicated in this
chapter. The fact that the fractional goal reaction ($r_G$) occurs in an
antedating manner at the beginning of the behavior chain or
sequence constitutes on the part of the organism a molar foresight or
foreknowledge of the not-here and the not-now. It is probably roughly
equivalent to what Tolman has called “cognition” (22).

Now this $r_G$ is behavior of peculiar significance. It does not itself
produce any change in the external world; neither does the act 
itself bring the organism any nearer to the food. What the act does 
is to produce the goal stimuli which evoke responses by the organism 
that tend to lead it to food, a mate, or whatever the goal or terminus 
of the action sequence at the time may be. In short, its function is 
strictly that of producing a critically useful stimulus in biological 
problem solution (5, p. 515); i.e., it is a pure-stimulus act. 

When an organism begins to respond to a situation which does 
not yet exist but is impending, we say informally that the organism 
anticipates or expects the event to occur. Since time out of mind 
the ordinary man has used the words expect, expectation, expectancy, 
and expectative in a practically intelligent and intelligible manner.
Around 1931, Tolman put forward the term expectation in a tech- 
nical sense as “an immanent cognitive determinant aroused by 
actually presented stimuli” (22, p. 444). Moreover, Tolman insisted 
that none of his technical concepts should lend support to any sort 
of “ultimately teleological and ultimately mentalistic interpretation
of animal . . . behavior” (22, p. xii). Were it not for the fact
that his writings at the time and since appear to be strongly opposed
to an approach resembling the one here presented, we might
suppose that the $s_G$ cited above might be a concrete case of Tolman's
immanent cognitive determining stimulus mediating the expectation,
i.e., $r_G \rightarrow s_G$ as the covert expectancy, and
$s_G \rightarrow R_G$ as the thing expected.

Now, human beings manifest this undoubted behavior much the 
same as do animals (4). When the incipient tendencies to $r_G \rightarrow s_G$
arise in their bodies these as stimuli may evoke verbal responses 
such as, “Dinner will soon be ready.” Presumably such verbal 
reactions, even incipient ones as symbolic, i.e., pure-stimulus, acts, 
may make great differences in the dynamics of the situation. In 
order to avoid the ambiguity of confusing two things which are 
very different, we recommend that antedating situations with 
potential speech accompaniment, as in humans, be called expectative, 
and that antedating situations in lower animals without potential 
speech be called merely anticipatory. In that way we may help
protect ourselves from inadvertently committing the fallacy of 
anthropomorphism and from implicitly but falsely assuming the 
dynamics of speech in animals not possessing such powers. 

Another undoubted aspect of behavior which Tolman (22, pp. 
12 ff.) has emphasized earlier, is purpose. This term has a bad meta- 
physical history but represents an undoubted aspect of mammalian 
behavior. We often know what we are about to do before we per- 
form an act, sometimes long before. There is reason to believe 
that an organism’s far antedating foreknowledge of its own goal 
and subgoal acts is mediated by subvocal speech pure-stimulus 
acts. If we define purpose as far antedating foreknowledge, or an
organism’s cognition of its own acts, this would presumably limit
strictly purposive behavior to humans. 


A SOMEWHAT MODIFIED HYPOTHESIS AS TO THE CRITERION 
OF REINFORCEMENT

Upon reexamining an earlier version of Postulate 4 (5, p. 178, and
related sections of that work, notably pages 80 and 98), it may be
seen that there is some inconsistency in the statements. The formal 
postulate states that reinforcement is the result of the diminution 
of a need or D. On the other hand, the formulation on page 80
states that reinforcement is due to the reduction in $S_D$. For example,
on page 80 it is stated in italics: 

Whenever an effector activity occurs in temporal contiguity 
with the afferent impulse, or the perseverative trace of such an 
impulse, resulting from the impact of a stimulus energy upon a 
receptor, and this conjunction is closely associated in time 
with the diminution in the receptor discharge characteristic 
of a need . . . there will result an increment to the tendency 
for that stimulus to evoke that reaction. 

Now ordinarily a reduction in a need implies that a reduction will 
soon follow in the drive stimulus and a reinforcement as well. 
This doubtless was the reason for the looseness of the preceding 
phraseology. In this connection it may be noticed that the present 
postulate (III) has taken the reduction in $S_D$ rather than of the
need or D as the essential criterion of reinforcement. 

Sheffield and Roby appear to have presented a critical case in 
point (77). They showed that hungry albino rats are reinforced by 
water sweetened by saccharine which presumably is not at all 
nourishing, i.e., it does not reduce the need in the least. It may 
very well be that the ingestion of saccharine-sweetened water 
reduces hunger tension $S_D$ for a brief period sufficient for a mild
reinforcement, much as the tightening of the belt is said to do in 
hungry men, thus reinforcing that act. On the other hand it may 
be that Sheffield and Roby are right in their suggestion that the 
critical factor in learning is the act of ingestion, i.e., $R_G$ and $r_G$.
Indeed it may very well be that all the critical facts are not even 
yet fully known. A slight adaptation of the above equations should 
fit this hypothesis. 

And finally we may note the role of $r_G \rightarrow s_G$ as a secondary
reinforcing agent. A judicious exploration of these possibilities is likely
to give a rather different picture of learning from that usually held 
at present.

6. Simple Behavior Chains


Simple trial-and-error learning (Chapter 2) involves the competition
of two or more reaction potentials, such as the tendency of an
albino rat to press a horizontal bar downward or to push a vertical 
bar to the left. The process of learning consists in the gradual 
acquisition of dominance by one of the tendencies. The end state 
in simple trial-and-error learning is thus less complex than the 
beginning in that there is only one overtly functioning reaction 
tendency, instead of several. 


Even from casual observation of the behavior of mammalian
organisms, however, it is quite evident that not all learning results
in simplification; in one way or another most learning eventuates
in the complication of behavior. This is because such acts as the two
just considered may be joined as links in a chain of reactions of
greater or less length. For example, the situation may be such that
both the downward pressure on the horizontal bar and the lateral
pressure on the vertical bar must occur, and in that order, before
food can be secured. This is called terminal reinforcement, because
reinforcement occurs at the termination of the reaction chain. Or
it may be that food will be delivered after each act, provided the
acts in question are performed in a certain order. This is called
serial reinforcement, because reinforcement occurs in a series
throughout the chain.


Conditions under Which Simple Chaining and the Integration of
Homogeneous Reactions May Occur

Consider a situation such as that set up by Arnold (7), where a
miniature car runs on a track past the window of an albino rat's
restraining chamber (Figure 40). Let
us assume that through previous training the rat has a reaction 
potential well above the threshold of pressing a white button or 
disk, placed on the side of the car facing the window, to secure a 
pellet of food. On this side of the car there are also placed at equal 
intervals three duplicates of the white button, making a total of 
four identical stimuli, B1, B2, B3, and B4; B2 does not appear in the
drawing. 



FIGURE 40. Diagram showing the essential structure of Arnold’s apparatus. The four
button manipulanda B1, B2, B3, and B4 were mounted on a car which could be drawn
to the right by a windlass (G) operated by the motor (M). The wheels of the car are
shown on the track at W. The rat was placed in the celluloid cylinder labeled “Box”
and had access to the buttons through a window in the cylinder when the shutter (S)
was lifted as shown in the drawing. When the animal pressed button B1 the car moved
forward, exposing B2 (obscured in the drawing by the cylindrical box) to the animal.
When this button was operated the car moved forward, exposing B3, and so on to B4.
When B4 was operated the magnet (R) released the shutter closing the window. At the
same time the magnet at F released ten pellets of food into the pan, P, which gave the
animal primary reinforcement. The car is shown as if in motion, the animal having just
pushed B1, with B2 moving up into position. Reproduced from Arnold (7).

At the beginning of the learning sequence the shutter rises, 
exposing B1. The rat sees this and presses it as he has been trained
to do, but he receives no food. Instead, the motor at once moves the
car forward, exposing B2. Presently the animal presses B2 and the
car moves forward, exposing B3. Again the animal receives no food
but at length he presses this button and again the car moves forward,
exposing B4. Experimental extinction has not advanced very
far with this rat, so that after a short delay he presses B4 also. This 
pressure automatically lowers the shutter and at the same time ten 
pellets drop into the food-pan, giving the animal primary terminal 
reinforcement. The initial performance of a four-link behavior
chain has now occurred. Because the several acts of the chain are
substantially alike, this is called a homogeneous behavior chain.
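
The sequence of events just described can be sketched as a small state machine. This is only an illustration of the contingencies stated in the text (each press of the exposed button advances the car; the fourth press closes the shutter and delivers ten pellets). The class and method names are hypothetical, and the rule that a press on any button other than the exposed one has no effect is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ChainApparatus:
    """Illustrative model of a four-button chain with terminal reinforcement."""

    n_buttons: int = 4
    pellets_at_end: int = 10
    exposed: int = 1            # number of the button now at the window
    shutter_open: bool = True
    pellets_delivered: int = 0

    def press(self, button: int) -> str:
        # Assumption: only the button currently at the window can be operated.
        if not self.shutter_open or button != self.exposed:
            return "no effect"
        if button < self.n_buttons:
            self.exposed += 1
            return f"car advances, exposing B{self.exposed}"
        # Last link of the chain: terminal reinforcement.
        self.shutter_open = False
        self.pellets_delivered += self.pellets_at_end
        return "shutter closes, food delivered"


apparatus = ChainApparatus()
events = [apparatus.press(b) for b in (1, 2, 3, 4)]
for number, event in enumerate(events, start=1):
    print(f"press B{number}: {event}")
```

Only the fourth press is followed by food, which is the sense in which the reinforcement of the chain is terminal.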

It is evident that the stimuli which evoke the successive action 
links of this chain come from without the organism, i.e., the stimuli
are exteroceptive. However, the chain does not become fully 
integrated until the proprioceptive stimuli arising from the animal’s 
own muscles in the performance of one behavior link serve, at least 
in part, to evoke the succeeding behavior link, and so on to the 
end of the four-link chain. Thus further integration of the behavior 
chain is effected by the repetition of its performance with the 
terminal reinforcement of ten food pellets. Let us assume that this 
occurs once every 24 hours for 25 days. 

We may now proceed to consider some of the more obvious 
behavior principles involved in the chaining integration process 
and their characteristic behavioral implications. 

Terminal Reinforcement and the Goal Gradient 

On the first evocation of the response chain the reaction potential 
will be approximately equal to that which obtained following the 
last individual reinforcement, except for extinction effects. Let us 
suppose that this reaction potential is 2.0σ in amount. However,
following the reinforcement at the end of chaining trial 1 there 
will begin to develop a different set of reaction potentials, those 
resulting from the goal or terminal reinforcement (delay). We 
shall assume that this gradient of reinforcement after a number of 
trials is represented roughly by the equation, 

$$\Delta{}_{S}E_{R} = 10^{-.07t}, \qquad (49)$$

and that the four delays (t) in reinforcement at links

I, II, III, and IV

respectively, are:

9", 6", 3", and 0".

Substituting these values successively in equation 49, we have the
following reaction potentials:

.234, .380, .617, and 1.000.
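
The four values can be reproduced with a few lines of code. The sketch assumes that equation 49 has the form 10 raised to the power −.07t; the exponent constant .07 is inferred here from the printed values themselves and should be read as an assumption, as should the variable names.

```python
# Delay in seconds between each link of the chain and the terminal reinforcement.
DELAYS = {"I": 9, "II": 6, "III": 3, "IV": 0}

EXPONENT = 0.07  # assumed constant; chosen because it reproduces the printed values


def gradient_increment(delay_seconds: float) -> float:
    """Reaction potential contributed by the delayed terminal reinforcement."""
    return 10 ** (-EXPONENT * delay_seconds)


increments = {link: round(gradient_increment(t), 3) for link, t in DELAYS.items()}
for link, value in increments.items():
    print(f"link {link}: {value:.3f}")
```

The link nearest the food (IV, zero delay) receives the full unit increment, and each additional three seconds of delay multiplies the increment by roughly .617.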
Unfortunately for the ease of understanding the quantitative
aspects of behavioral chaining, the gradient of reinforcement suffers
several quantitative distortions, so that to superficial observation
it is at the end hardly recognizable. The first distortion occurs in
the summation of the four reaction potentials with the separate
initial flat reinforcement, which is assumed to amount to 2.0σ.
Accordingly we combine (+) 2.0σ from the original training with
the series of reaction potential values, to secure the final
reinforcement gradient. This gives us, respectively,

2.16, 2.25, 2.41, and 2.67.
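
The combination (+) is not simple addition, since 2.0 plus .234 would give 2.234 rather than 2.16. A minimal sketch follows, assuming the summation rule a + b − ab/M with a ceiling M of 6.0σ. Both the form of the rule and the value of M are assumptions adopted here because they reproduce the four printed values; the names are illustrative.

```python
M = 6.0  # assumed upper limit of reaction potential, in sigma units


def behavioral_sum(a: float, b: float, ceiling: float = M) -> float:
    """Combine two reaction potentials so that the total cannot exceed the ceiling."""
    return a + b - (a * b) / ceiling


INITIAL = 2.0                               # flat potential from the original training
INCREMENTS = [0.234, 0.380, 0.617, 1.000]   # gradient values for links I to IV

final_gradient = [round(behavioral_sum(INITIAL, inc), 2) for inc in INCREMENTS]
print(final_gradient)
```

Under these assumptions the combined values for links I to IV agree with those given in the text to two decimal places.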


Stimulus Generalization 

The next step in this quantitative modification or distortion of the 
gradient of reinforcement arises from the principle of stimulus 
generalization (72). Since all the exteroceptive stimuli for the four 
responses of the chain are as alike as they can be made, the generali- 
zation would be nearly perfect except for the influence of certain 
stimulus-trace intensities. However, these traces evidently become 
reinforced to the response with ease, and generalization presumably 
takes place jointly on the two bases. 

At this point we must recall that in respect to stimulus traces 
there are two distinct types of generalization: generalization (1)
toward the maximum on the subsident phase of a trace, and (2)
toward the fading termination of the trace. Moreover we recall 
(Figure 31) that from a given point on the subsident phase of the 
trace to which a response has been reinforced, generalization toward 
the trace origin tends to rise for some time rather than fall, as it 
does in simple generalization (p. 104 ff.); i.e., other things constant, 
generalization toward the maximum on the subsident phase of a 
stimulus is more intense than that toward the fading termination 
(Theorem 21 C). 

A somewhat detailed but rough representation of the various 
traces involved is given in Figure 41. Here we see that at IV the
reaction potential to $R_4$, as indicated by the oval, rises from a
strong (young) trace of $s_4$, from a not so strong (older) trace of $s_3$,
from a weaker trace of $s_2$, and from a still weaker trace of $s_1$. Now,
$R_4$ will tend to generalize backward (to the left) along all four of
these traces as continua. But at III the trace from $s_4$ is lacking,
which will weaken the generalized ${}_{S}E_{R_4}$; at II both $s_3$ and $s_4$
are lacking, which will weaken the generalized ${}_{S}E_{R_4}$ further; and
at I, $s_2$, $s_3$, and $s_4$ are all lacking, which will weaken the generalized
${}_{S}E_{R_4}$ from IV still more below the slight rise theoretically to be
expected of a trace generalizing toward its maximum (Figure 31).
Accordingly it is to be expected that in the stimulus chaining
situation the gradient for the evocation of $R_4$ will be a falling one
from IV to III, to II, and to I, descending in that order.



On the other hand, $R_1$ is reinforced to $s_1$ alone at I. This trace
will weaken through the mere passage of time; it will weaken
more at II, and still more at IV. We have seen (Figure 31) that
generalization toward the fading termination of a trace falls. The
traces which are added from II to IV will also weaken the
generalized trace progressively.

Too many unknown principles and constants are involved in the
generalization situation for us to attempt a detailed deduction of
the exact gradient at the present time. It is evident, for example,
that there are present in unknown amounts both qualitative and
intensity elements which are believed to depend on different
equations. For present expository purposes, accordingly, we have chosen
to use the qualitative equation of Postulate X A, with the constant
parts of the exponent at .15 and .30 respectively for the two
directions of generalization. We take as the equation for the larger
generalization, i.e., the less rapid generalization fall toward I,*

$${}_{S}E_{R} = A \times 10^{-.15d} \qquad (50)$$

and for the weaker generalization, i.e., the more rapid
generalization fall toward IV,

$${}_{S}E_{R} = A \times 10^{-.30d} \qquad (51)$$

where A represents the value of the gradient of reinforcement at the
origin of the generalization. These values appear in bold-faced
type in Table 13.

Intimately connected with these exponents is the matter of the d
values. As the generalization difference between III and IV, we
take 1; as that between II and III, we take 2; and as that between
I and II, we take 4. We have chosen this much larger difference or
d value between I and II because the difference between the trace
evoking response I (the shift from the living-cage to the apparatus
box) and that evoking response II (the continuation of the apparatus
box plus the proprioception of response I) is probably much
greater than the difference between the traces evoking II and III
which are to a large extent a continuation of the trace evoking II,
and so are less different. This means that the d between I and III
(or III and I) is 4 + 2 = 6; between I and IV (or IV and I), it is
4 + 2 + 1 = 7; and between II and IV (or IV and II), it is
2 + 1 = 3. Thus the generalization from I to III would be:

$${}_{S}E_{R} = 2.16 \times 10^{-.30 \times 6} = .03.$$
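
The bookkeeping of the d values and the single example above can be sketched in code. The sketch assumes that equation 51 has the form A × 10^(−.30d) for generalization toward IV; the helper names are illustrative and are not taken from the text.

```python
ORDER = ["I", "II", "III", "IV"]

# d values between adjacent response points, as chosen in the text.
STEP_D = {("I", "II"): 4, ("II", "III"): 2, ("III", "IV"): 1}


def d_between(a: str, b: str) -> int:
    """Sum the adjacent d values lying between two response points."""
    i, j = sorted((ORDER.index(a), ORDER.index(b)))
    return sum(STEP_D[(ORDER[k], ORDER[k + 1])] for k in range(i, j))


def generalize_toward_iv(origin_value: float, d: int) -> float:
    """Assumed form of equation 51: the more rapid fall, exponent constant .30."""
    return origin_value * 10 ** (-0.30 * d)


for pair in (("I", "III"), ("I", "IV"), ("II", "IV")):
    print(pair, d_between(*pair))

example = generalize_toward_iv(2.16, d_between("I", "III"))
print(round(example, 2))
```

Because d accumulates across the intervening links, the potential reinforced at I has almost vanished by the time it generalizes to III.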

* Presumably this slower apparent generalization fall toward I is due to the greater
stimulus intensity (V) of the early part of the stimulus trace, which, strictly speaking,
is not generalization at all. Equations 50 and 51 are coarse molar makeshifts which we
will use until we know more about V and related matters.

We are now in a position to observe how stimulus generalization
operates. The details of this are explained in Table 13, where the
four values of the final gradient of reinforcement appear in
bold-faced type in the separate lines, and the dependent generalization
values of each are in ordinary type in the same line. It will be
noted that the first of the four gradient reinforcement values
(2.16σ) can generalize only toward IV and that the fourth (2.67σ)
can generalize only toward I, whereas the second (2.25σ) and the
third (2.41σ) can generalize in both directions.

TABLE 13. The steps used in computing the theoretical mean reaction latencies at the
response points of a four-link homogeneous response chain. The theoretical stimulus
generalization reaction potentials (${}_{S}E_{R}$) are calculated from the
gradient-of-reinforcement values, the latter of which are shown in bold-faced type.

    d values                           4        2        1        1
    Response number                    I        II       III      IV

    Based on 9" delay                  2.16     .14      .03      .02
    Based on 6" delay                  .57      2.25     .57      .28
    Based on 3" delay                  .30      1.21     2.41     1.21
    Based on 0" delay                  .24      .95      1.89     2.67

    Behavior sums (+) of sEr           2.82σ    3.54σ
    Reaction latencies (stR)           3.51"    2.19"    1.91"    2.30"
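
The body of Table 13 can be recomputed as a check. The sketch below rests on three assumptions that are not stated in this form in the text: that equations 50 and 51 are A × 10^(−.15d) toward I and A × 10^(−.30d) toward IV, that the behavior sums follow the rule a + b − ab/M, and that M is 6.0σ. The printed table entries are entered by hand for comparison, and all names are illustrative.

```python
M = 6.0                      # assumed ceiling for behavioral summation
D_STEPS = [4, 2, 1]          # d between I-II, II-III, III-IV
ORIGINS = [2.16, 2.25, 2.41, 2.67]  # final gradient values at I, II, III, IV

# Entries as printed in Table 13 (rows: origin of generalization; columns: I to IV).
PRINTED = [
    [2.16, 0.14, 0.03, 0.02],
    [0.57, 2.25, 0.57, 0.28],
    [0.30, 1.21, 2.41, 1.21],
    [0.24, 0.95, 1.89, 2.67],
]


def distance(i: int, j: int) -> int:
    """Total d between two response points given by index (0 = I, 3 = IV)."""
    lo, hi = sorted((i, j))
    return sum(D_STEPS[lo:hi])


def generalized(source: int, target: int) -> float:
    """Generalized potential at `target` from the value reinforced at `source`."""
    if source == target:
        return ORIGINS[source]
    rate = 0.30 if target > source else 0.15  # faster fall toward IV, slower toward I
    return ORIGINS[source] * 10 ** (-rate * distance(source, target))


def behavioral_sum(values, ceiling: float = M) -> float:
    """Fold the assumed summation rule over a column of potentials."""
    total = 0.0
    for v in values:
        total = total + v - total * v / ceiling
    return total


computed = [[round(generalized(s, t), 2) for t in range(4)] for s in range(4)]
column_sums = [behavioral_sum([PRINTED[s][t] for s in range(4)]) for t in range(4)]

print(computed == PRINTED)
print([round(x, 2) for x in column_sums])
```

Under these assumptions every generalized entry agrees with the printed table to two decimals. The sums for columns I and II come out close to the printed 2.82σ and 3.54σ; the sum for column I differs in the second decimal, presumably because the printed figure was rounded at each intermediate step. The sums this sketch gives for columns III and IV (roughly 3.79 and 3.47) are illustrative only, since the corresponding cells of the table are not legible here.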


Behavioral Summation of Homogeneous Reaction Potentials

It will be noted in Table 13 that at response I there are present
the primary reinforcement gradient reaction potential of 2.16σ and
the three generalized reaction potentials of .57σ, .30σ, and .24σ,
all of which tend to evoke the same response. This means that they
summate behaviorally (+). Thus we combine the first two of the
numbers in column I, 2.16 and .57, then .30 (which yields 2.69),
and then .24 (which yields 2.82), as recorded in the next-to-last
row of the table. The behavioral summations of the other columns
are computed in a parallel manner and recorded in the same row.
These values show what remains of the gradient of reinforcement at
the end of the chain process. The gradient is unrecognizable
except for the relative size of the two extreme values, I and IV,
taken by themselves, and of the two middle values, II and III,
taken by themselves. Still, as in the gradient of reinforcement,
IV > I and III > II.
Chaining Reaction Latency

Finally, these summated reaction potentials must be converted
into the corresponding reaction latencies, in order that we may
have a directly observable empirical indication of the validity of
our theoretical derivation. Gladstone et al. (7) have suggested the
general nature of the relation between ${}_{S}E_{R}$ and ${}_{S}t_{R}$. But
their equation was for a different response and evidently yields too
small values. However, an adaptation,

will serve temporarily as a first approximation. The substitution
one at a time of the values in the next-to-last row of Table 13 yields
the rough theoretical latency values in seconds appearing in the
last row. In Figure 42 a graphic representation of these values
shows that the gradient slopes upward at each end. Table 13 and
Figure 42 both give the latency at IV as less than at I, and that at
III as less than that at II. Presented formally these four critical
relationships are:

I > IV, II > III, I > II, and IV > III.

FIGURE 42. Graph showing theoretical reaction latency at the four reaction points of
a homogeneous chaining process with terminal reinforcement.

The preceding considerations lead us to our next theorem; 

THEOREM 32. The latencies of the responses in a simple four-link
homogeneous reaction chain with terminal reinforcement show on the
average the following fairly stable relationships: I > IV, II > III,
I > II, and IV > III.

We turn now to the question of the empirical validation of this 
deduction. Arnold (?) performed the experiment described above 



FIGURE 43. Empirical mean reaction latency curve for homogeneous response chaining
under food reward and terminal reinforcement, based on trials 2-10. Abscissa: order of
manipulandum presentation. Adapted from Arnold {f, p. 356).

based on food reinforcement, and, using a separate group of ani- 
mals, a parallel experiment in which the reinforcing agent was 
the cessation of a weak electrical shock to the rat’s feet. A graphic 
representation of his food-reinforcement results appears in Figure
43. There we see that the four reaction latencies in the chain de- 
crease in the order I, IV, II, and III, as in the theoretical deduction 
(Table 13). However, a comparison of Figures 42 and 43 indicates 
that the deduction is not exact; our constants are probably incorrect,
though the general relationship evidently holds. This degree
of agreement between experiment and theory is, perhaps, as close 
as may reasonably be expected in the present early stage of the 
science. Probably the most significant outcome of the analysis is an 
understanding of the detailed reasons for the characteristic trans- 
formation which the gradient of reinforcement undergoes (a reduc- 
tion in the latencies of II and III as compared with those of I and 
IV) through the influence of stimulus generalization. As a matter 
of fact, this same general picture is seen in the results of Arnold’s 
electric shock experiment already mentioned (7), as well as in other 
evidence to be presented later (p. 174). 

Heterogeneous Response Chains with Terminal Reinforcement 
At this point in our exposition we turn to a modification of the 
food-reward experiment just described. In this investigation Arnold 
used the same apparatus as that shown in Figure 40 except that in 
place of the four identical press buttons there were four manipu- 
landa, each one different in appearance from the others and each 
one involving largely different response behavior (2). These were: 
a high horizontal bar, a low horizontal bar, a vertical bar, and a 
watch chain suspended from above (8). Now, since the reactions
to these manipulanda are essentially different from one another 
their traces will be essentially different. Because of this the d value 
between what evokes I and what evokes II will be 4, the same as 
before, but the values between II and III, and III and IV, are both 
assumed to be larger than in homogeneous chains, and to be equal 
at 3. These values are formally shown in the first line of Table 14. 

Generalizing on these and some earlier considerations (p. 161), 
we arrive at the following theorem: 

THEOREM 33. The generalization-difference (d) values in four-link 
homogeneous chaining follow a progressive decrease from the beginning 
to the end of the chain, whereas in four-link heterogeneous chaining the
differences begin with the same relatively large value as that in homo- 
geneous chaining, and, following a slight early fall, remain constant 
thereafter. 

The empirical verification of Theorem 33 must be indirect and 
rather slow in becoming clearly positive or negative. 

Substituting these d values and the gradient-of-reinforcement
values appropriately in equations 50 and 51, we calculate the 
generalization values shown in Table 14. But instead of summating, 

TABLE 14. The steps of computing the theoretical mean reaction latencies at the
response points of a four-link heterogeneous response chain with terminal
reinforcement. The theoretical gradient-of-reinforcement values are in bold-faced type.
The generalization (bEr) values related to each are shown in the same lines in ordinary
type.

d values                               4        3        3

Response number                   I        II       III      IV

Based on 9" delay                2.16      .14      .02      .002
Based on 6" delay                 .57     2.25      .28      .04
Based on 3" delay                 .21      .86     2.41      .30
Based on 0" delay                 .08      .34      .95     2.67

bEr interference values (∸)      1.55σ    1.25σ    1.50σ    2.47σ
Reaction latencies (stR)        12.11"   18.90"   12.96"    4.62"


these generalized reaction potentials are all assumed to be different
and therefore to interfere (∸) with one another.

Accordingly all the theoretical gradient-of-reinforcement values
of Table 13 will hold for the present project, but the generalization
values are for the most part different because they are usually based
on different d's. And, necessarily, the modes of determining the
joint bEr's of the various columns of data are quite different; the
several generalized values are withdrawn (∸) from the basic
gradient-of-reinforcement value because presumably different reaction
potentials interfere rather than summate. Consider column I
as an example. Using equation 13, we have,


2.16 ∸ .57 = 1.76

1.76 ∸ .21 = 1.61

1.61 ∸ .08 = 1.55
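
The three steps for column I may be retraced numerically. The sketch below is offered only as an aid to checking the table; it assumes that equation 13 has the form C ∸ B = M(C - B)/(M - B) with an upper limit M of 6.0σ. That form and that limit are assumptions, adopted here because they reproduce the three worked values; the names `withdraw`, `residual`, and `TABLE_14` are hypothetical.

```python
# Sketch of the interference ("withdrawal") step used for Table 14.
# ASSUMPTION: equation 13 is taken to be  C - B (withdrawn) = M * (C - B) / (M - B)
# with M = 6.0 sigma. This is inferred from the worked steps, not quoted.

M = 6.0  # assumed upper limit of reaction potential, in sigma units


def withdraw(total, part, m=M):
    """Withdraw one interfering reaction potential from another."""
    return m * (total - part) / (m - part)


def residual(dominant, interfering):
    """Withdraw each interfering potential from the dominant one in turn."""
    value = dominant
    for part in interfering:
        value = withdraw(value, part)
    return value


# Columns of Table 14: (gradient-of-reinforcement value, generalized values).
TABLE_14 = {
    "I": (2.16, [0.57, 0.21, 0.08]),
    "II": (2.25, [0.86, 0.34, 0.14]),
    "III": (2.41, [0.95, 0.28, 0.02]),
    "IV": (2.67, [0.30, 0.04, 0.002]),
}

RESIDUALS = {link: residual(d, g) for link, (d, g) in TABLE_14.items()}

for link, value in RESIDUALS.items():
    print(f"{link:>3}: {value:.2f} sigma")
```

Under these assumptions the four residuals fall within about .01σ of the next-to-last row of Table 14; the small differences would be expected if the printed steps were rounded to two decimals at each stage.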


Similar computations were performed for each of the other three
columns of Table 14, the results of which appear as the next-to-last
row of values in that table. Then these reaction potentials were
converted into reaction latencies by means of equation 52; they
appear in the last row of the table. There it may be seen that I > IV
and that II > III, as in the gradient of reinforcement. It may also
be noted that II > I and that III > IV.
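
The conversion from reaction potential to latency can be approximated from the table itself. The following sketch does not reproduce equation 52; it assumes only that the relation is a simple power law, stR = a / (bEr)^b, and estimates a and b by least squares on logarithmic axes from the last two rows of Table 14. The function names are hypothetical.

```python
import math

# Next-to-last and last rows of Table 14: reaction potential (sigma) and
# theoretical latency (seconds) at links I-IV.
POTENTIALS = [1.55, 1.25, 1.50, 2.47]
LATENCIES = [12.11, 18.90, 12.96, 4.62]


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b) on log-log axes; returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    slope = sxy / sxx
    return math.exp(my - slope * mx), -slope


# ASSUMPTION: a power-law form; the constants are estimates, not quoted values.
A, B = fit_power_law(POTENTIALS, LATENCIES)


def latency(potential, a=A, b=B):
    """Approximate latency in seconds for a reaction potential in sigma."""
    return a / potential ** b


print(f"a = {A:.2f}, b = {B:.3f}")
```

The fitted constants come out near a = 30 and b = 2.07, and the fitted curve passes within a few hundredths of a second of each of the four tabled latencies. This is a descriptive fit under the stated assumption and should not be read as the equation actually used.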

From the preceding considerations we arrive at our next theorem:

THEOREM 34. The latencies in a simple four-link heterogeneous
reaction chain with terminal reinforcement will present the following
stable quantitative relationships: I > IV, II > III, II > I, and
III > IV.

One empirical test of the soundness of the above theoretical 
deduction is furnished by an experiment performed by Arnold (2). 
The relevant empirical latency results are: 

I        II       III      IV
4.24"    5.16"    3.56"    1.92"

It may be seen at a glance that here also, 

I > IV, II > III, II > I, and III > IV. 

The relationships specified in Theorem 34 hold, but one unspecified 
relationship, that between I and III, does not. We note, moreover, 
that all of the values in the theoretical deduction are much larger 
than the corresponding ones in the empirical findings. It is clear 
from this that something is defective; presumably this is with the 
constants utilized in the deduction. On the other hand, the close 
agreement of the general shapes of the functions when represented 
graphically, especially regarding relationships specified in the 
theorem, suggests that important elements in the theory correspond 
to fact. Other relevant data bearing on this type of chaining will be 
presented later (pp. 175 ff.). 

Homogeneous Response Chains with Serial Reinforcement 
Our theoretical analysis of behavioral chaining now turns to the 
form we have called serial reinforcement. In this situation, it will 
be remembered, the animal receives food immediately following 
each of its four responses. If a separate gradient of reinforcement is 
set up at each reward point it is evident that there will be four of 
these gradients, though several will be incomplete; and that they 
will summate, presumably as shown in Table 15. This will yield a 
gradient of serial reinforcement. The last row of Table 15 shows that 
the summation gradient of serial reinforcement is convex upward, 
and that its value at I is larger than that at IV, an almost complete
reversal of the four gradients of reinforcement from which it
is derived. 

TABLE 15. Derivation of the gradient of serial reinforcement from the summation (+)
of the four gradients of reinforcement produced by the feeding after each act in a
four-link chain.

Points of reinforcement             I       II      III     IV

1st reinforcement                 1.000
2nd reinforcement                  .617   1.000
3rd reinforcement                  .380    .617   1.000
4th reinforcement                  .234    .380    .617   1.000

Gradient summation (+)            1.96    1.80    1.52    1.00
+ initial 2.0σ reinforcement      3.31    3.20    3.01    2.67
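
The arithmetic of Table 15 may be retraced as follows. The sketch assumes that behavioral summation (+) combines two reaction potentials as a + b - ab/M with an upper limit M of 6.0σ; this rule and this limit are assumptions adopted because they reproduce the tabled sums, and the names `summate` and `serial_gradient` are hypothetical. The factor .617 per link is read directly from the table.

```python
M = 6.0        # ASSUMED upper limit of reaction potential (sigma)
STEP = 0.617   # gradient value one link back from a reward (Table 15)
INITIAL = 2.0  # initial reinforcement in sigma, as in the last row


def summate(a, b, m=M):
    """Behavioral summation of two reaction potentials (assumed rule)."""
    return a + b - a * b / m


def serial_gradient(n_links=4):
    """Summate, at each link, the gradients from that reward and all later ones."""
    sums = []
    for link in range(n_links):
        total = 0.0
        # The reward at this link is 0 steps away, the next is 1 step away, etc.
        for steps in range(n_links - link):
            total = summate(total, STEP ** steps)
        sums.append(total)
    return sums


GRADIENT = serial_gradient()
WITH_INITIAL = [summate(g, INITIAL) for g in GRADIENT]

for name, g, w in zip(("I", "II", "III", "IV"), GRADIENT, WITH_INITIAL):
    print(f"{name:>3}: summation {g:.2f}, with initial 2.0 sigma {w:.2f}")
```

Under these assumptions the computed values agree with the last two rows of the table to within about .01, and the value at I exceeds that at IV, as stated in the text.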


Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 35. When reinforcements follow each of the successive
responses of a behavior chain a series of overlapping
gradient-of-reinforcement sections are generated, the summation of which produces
an upward-sloping and upward-arching serial-reinforcement gradient.

We proceed now to calculate as before the generalized reaction 
potentials based on the serial gradient. These also are found by 
means of equations 50 and 51. The values are given in Table 16.
Summating (+) the values in the respective columns we have the
reaction potentials in the next-to-last line. Substituting these bEr’s 
in equation 52, we secure an approximation to the theoretical reac- 
tion latency values. The theoretical latencies, in the last line of 
Table 16, show the same upward sloping of the two end sections 

TABLE 16. Steps in the computation of the theoretical mean reaction latencies at the
response points of a four-link homogeneous response chain when integrated by serial
reinforcement. The serial-reinforcement gradient values are in bold-faced type, as
taken from Table 15. The generalization values related to each are shown on the
same lines in ordinary type.


Response number
Based on 9" delay
Based on 6" delay
Based on 3" delay
Based on 0" delay
Behavior sums (+) of bEr
Reaction latencies

as appears in Figure 42. But the present results differ from the
computations from which Figure 42 was derived, as is to be expected 
from the fact that they are based on distinct types of gradient in the 
general slope downward from IV to I. 

From the preceding considerations we arrive at our next theorem: 

THEOREM 36. The latencies in a simple four-link homogeneous
reaction chain with serial reinforcement will show the following stable
relationships between the extremes of the chain and the links adjacent
to the extremes: IV > I, III > II, I > II, and IV > III.

Turning to the matter of empirical validation, we have another 
relevant investigation by Arnold (4). Since this experiment was
performed at the University of Nebraska, the apparatus used in 
Arnold’s earlier investigations was not available. Accordingly he 
carried out this later work on a very different apparatus, to super- 
ficial appearance at least. This was mainly a Skinner type of box in 
which there was a shutter shielding a single manipulandum — a 
bar which when pressed upward delivered a pellet and at once 
automatically withdrew through the wall of the chamber. The 
shutter was lowered while the rat was eating the pellet. Then the 
shutter was raised for the next trial, and so on to the fourth trial. 
It is clear that these external stimuli were by no means similar to 
those of the three experiments previously performed by Arnold 
with the car arrangement, though obviously this experiment in- 
volved homogeneous serial reinforcement. Unfortunately we do not 
know enough about chaining and behavior generally to say what 
effects these changes in apparatus and technique would produce. 

Arnold’s comparable experimental results were:

I       II      III     IV
1.72    1.63    2.18    2.39

Despite certain deviations in the experiment, as noted above, its 
outcome was fairly close to the theoretical expectation indicated 
in Table 16. An inspection of the above data shows that all four of 
the relationships specified in Theorem 36 hold, 

IV > I, III > II, I > II, and IV > III,

though one other, that between I and III, does not. For some 
unknown reason the theoretical values are in general much nearer 
the comparable empirical values in size than has been the case in
several such comparisons. Fortunately there is other relevant
empirical evidence on homogeneous serial reinforcement (p. 180).

Heterogeneous Response Chains with Serial Reinforcement 
Our fourth and final case of simple four-link response chains concerns
a heterogeneous behavior chain with serial reinforcement.
From the preceding three presentations we have all of the quantitative
accessory elements which will be necessary for the derivation
of this one. We have already considered both heterogeneous response
chains and serial reinforcement gradient values in Tables 14
and 15 respectively, and the d values also in Table 14. We now
combine them appropriately in the computations of the values for
Table 17. As usual in such tables, the critical reaction latencies are
given in the last line. Here we see the latencies generally increasing

TABLE 17. Steps in the computation of the theoretical mean reaction latencies at the
response points of a four-link heterogeneous response chain when integrated by serial
reinforcement. The serial reinforcement gradient values are in bold-faced type. The
generalization values related to each are shown on the same line in ordinary type.

d values                         4        3        3

Response number             I        II       III      IV

Based on 9" delay          3.31      .21      .03      .003
Based on 6" delay           ?       3.20      .40      .05
Based on 3" delay           .27     1.07     3.01      .38
Based on 0" delay           .08      .34      .95     2.67

Residual bEr (∸)           2.71σ    2.25σ    2.18σ    2.41σ
Reaction latencies (stR)   3.81"    5.60"    5.98"    4.86"


from I to IV and from II to III, which shows the influence of the
serial reinforcement gradient, with the two middle values higher
than the extreme ones. This means that the two end sections of the
series slope downward as in Table 14, demonstrating the general
influence of the interfering heterogeneous generalizing values.

Formulating our general conclusions from the preceding theo- 
retical computations, we arrive at our next theorem. 

THEOREM 37. The latencies in a simple four-link heterogeneous
behavior chain with serial reinforcement will tend to have the following
four relationships between the extremes of the chain and the links
adjacent to those extremes: IV > I, III > II, II > I, and III > IV.

A study by Arnold (3) supplies empirical verification of this
theoretical deduction also. Fortunately in this experiment Arnold’s 
original apparatus (Figure 40) was used. His relevant results were 
as follows: 

I        II       III      IV

2.55"    8.59"    9.13"    4.18"

A comparison with Table 17 shows a reasonably close agreement, 
including all the points specified in Theorem 37, i.e., 

IV > I, III > II, II > I, and III > IV.

The theory in so far is substantiated. Other verifying evidence will 
be presented later (pp. 180 ff.).
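
The comparisons of theorem with observation made in the preceding three sections can be tabulated mechanically. The sketch below merely restates the empirical latencies quoted above for Theorems 34, 36, and 37 and tests each specified inequality; the names `holds`, `CASES`, and `RESULTS` are hypothetical. Theorem 32 is omitted because its empirical latencies are given only graphically (Figure 43).

```python
LINKS = ("I", "II", "III", "IV")


def holds(latencies, relations):
    """Return, for each (greater, lesser) pair, whether the inequality holds."""
    values = dict(zip(LINKS, latencies))
    return {f"{a} > {b}": values[a] > values[b] for a, b in relations}


# Empirical mean latencies in seconds, as quoted in the text.
CASES = {
    "Theorem 34 (heterogeneous, terminal)": (
        (4.24, 5.16, 3.56, 1.92),
        [("I", "IV"), ("II", "III"), ("II", "I"), ("III", "IV")],
    ),
    "Theorem 36 (homogeneous, serial)": (
        (1.72, 1.63, 2.18, 2.39),
        [("IV", "I"), ("III", "II"), ("I", "II"), ("IV", "III")],
    ),
    "Theorem 37 (heterogeneous, serial)": (
        (2.55, 8.59, 9.13, 4.18),
        [("IV", "I"), ("III", "II"), ("II", "I"), ("III", "IV")],
    ),
}

RESULTS = {name: holds(lat, rel) for name, (lat, rel) in CASES.items()}

for name, checks in RESULTS.items():
    print(name, checks)
```

Each of the twelve specified relationships is satisfied by the quoted values; the check says nothing about the absolute sizes of the latencies, on which theory and observation differ.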

A Form of Trial-and-Error Behavior Chaining

In the preceding pages of this chapter we have considered four 
types of very simple four-link behavior chaining. Now it will be 
our task to observe not only that there exist behavior chains of 
various numbers of links from short up to very long series, but that 
there exist chains involving the greatest variety of behavior and 
circumstances of evolution. This form of behavior chaining is one 
which in its acquisition demands a conspicuous element of trial-and-error
learning. In the considerable amount of experimental investigation
which has been devoted to it this has sometimes been called
compound trial-and-error learning (21).

A four-link linear rat maze with four choices at each choice point 
was used in several studies specifically concerned with compound 
trial-and-error learning (d, P, 13, 14, 21). This maze is shown in
Figure 44. At each choice point pressure from the animal’s body 
easily pushed up one of the four sloping doors and permitted loco- 
motion down the passageway to the next choice point, and so on 
to the end of the maze at F. Electrical contacts at all the maze 
doors recorded on the polygraph not only the doors actually 
passed under but, in proper order, all the doors erroneously 
attempted. 

At the beginning of such learning the probability of the correct 
door being tried is in general a matter of chance, i.e., one in four, 
but as learning continues the proportion of false to correct choices 
becomes progressively less. Moreover, the nature of the three types 
of error is clearly indicated and in some cases may be very significant.
Thus the phenomena which will mainly concern us in this
type of learning pass from response latencies, the only criteria
available in the validation of theoretical deductions involving
simple chaining, to the nature of the responses themselves, i.e.,
whether they are adaptive (right, R) or unadaptive (wrong, R).



As might be expected this marked difference in the type of phenomenon
to be taken as an indicator of the state of the chaining
process involves the use of different methods of quantification.

Homogeneous Linear Maze Chaining with Terminal Reinforcement

We saw above (p. 169) that Arnold was able to carry out experiments
on simple chaining with two very different types of apparatus.
Now we shall observe that all four forms of chaining
there considered may occur in the linear maze shown in Figure 44.
For example, we have homogeneous chaining with terminal reinforcement
if we permit the animal to pass only through door 1,
say, at all four choice points with food (F) reinforcement at the end. 
This passageway, projected on the floor plan of the maze, is shown 
in Figure 45. There it may be seen that when running the true 
course the rat must make the same type of turn before it advances 
in the maze, since each time it must pass through the short alley 
of the partition preceding and following each choice point. Con- 
sidered as a whole the locomotor behavior at I, from the first to 
the second partition, is the same as that at II, from the second to 
the third partition, and so on to the end. In the present coarse 
analysis these four locomotor sections may be considered as 

FIGURE 45. Diagrammatic representation of the floor plan of the compound
trial-and-error learning rat maze. The animals were placed in the maze at S. P represents the
partition in each section of the maze with a 2.5-inch passageway in its center forcing the
animals to make their choices of doors at choice points I, II, III, and IV from a comparable
position in the runway. The doors are numbered from 1 to 4 from the top down.
The animals were fed upon reaching G. The dotted line represents the correct pathway
through the maze for an animal whose learning task was to choose door No. 1 at each
choice point between S and G and to make the same reaction (homogeneous) at each
choice point. Analogous pathways were followed by animals whose task was to learn to
choose doors No. 2, 3, and 4 respectively. From Sprow (27, p. 198).

homogeneous even though at first considerable more or less random 
trial-and-error behavior usually intervenes between the passage 
through the partition and the successful door choice at a given 
choice point. 

As learning progresses the various R’s involved in the trial-and-error
process will gradually “short circuit” (Theorem 23 B) and
drop out of the sequence, and the traces of the stable or uniform
acts of the sequence as required by the apparatus will gradually
become reinforced to the acts in question, quite as in Arnold’s
parallel experiment. This means that we shall again find the
generalization of responses and their summation (+) in simple
chaining, exactly as shown in detail in Table 13. It is believed that
habituation of the animals to the maze preceding the actual training,
and the consequent secondary reinforcement, gave the equivalent
of the initial 2.0σ reinforcement operative in Arnold’s simple
chaining experiments.

But in order to calculate verifiable theoretical results from the
bEr values of Table 13, we must convert them into equivalent error
or R values. Fortunately Sprow (27) published an equation formulated
by H. G. Yamaguchi, which purports to give R = f(bEr).
Transposed, this is:

(53) 




Substituting the next-to-last row of theoretical values in Table 13 
one at a time in equation 53, we have the following theoretical 
error (R) values: 

I II III IV 
1.396 .760 .638 .834 


An examination of these results shows that as usual, 

I > IV, II > III, I > II, and IV > III, 

which is the substance of Theorem 32, though in a strikingly 
different activity as superficially viewed. 

Turning to the matter of an empirical check of this theoretical 
deduction, we find Sprow (27, p. 203) reporting errors (R) as 
follows: 

I        II       III      IV

1.305    .845     .720     .970,

which shows that:

I > IV, II > III, I > II, and IV > III.

Thus Theorem 32 receives further confirmation on all four points. 

There is another bit of incidental evidence regarding this deduc- 
tion. This comes from an experiment by Montpellier (20), orig- 
inally designed to solve a rather different problem. The apparatus 
consisted of three essentially similar six-link linear mazes with
two choices at each link. Montpellier gave one trial each day to a 
total of 42 blinded albino rats divided into groups of roughly 
equal numbers, on these mazes, the ground plan of one of which is 
represented in Figure 46. Calculating the weighted averages for 
the mean number of trials required to eliminate the erroneous
entrances at each of the six choice points, we secure the following 
values: 

I II III IV V VI 

5.01 3.84 3.26 2.59 3.33 3.67 

The outcome of this experiment when represented in a way 
parallel to our previous custom shows the two ends of the series 
tilting upward like Figures 42 and 43, and that: 

I > VI, III > IV, I > II, VI > V.

Thus in all respects these results from a fairly conventional form 



FIGURE 46. The pattern of a typical diamond linear rat maze used by Montpellier
(20). The food reward was given in the animal’s living cage which was attached for this
purpose to the right-hand extremity of the maze. Note that all the correct reactions are
right-turning.

of linear maze turn out to follow and thereby empirically to verify 
still further the secondary law presented in Theorem 32. 

Heterogeneous Linear Maze Chaining with Terminal Reinforcement 

The apparatus shown in Figure 44 was also used in heterogeneous 
chaining with terminal reinforcement. This differs from homogene- 
ous chaining only in that the correct pathway through the maze 
involves passing through a different door at each choice point, 
which of course means that the correct turning act before going 
under the door at each point is different, much as in the correspond- 
ing experiment by Arnold. Following the general expository logic 
of the preceding section, we find the theoretical analysis of this 
type of chaining the same as that displayed in Table 14. 

But in maze chaining each of the values of Table 14 appearing 
in the rows with bold-faced type has a distinctive, observable
meaning in the outcome of the experiment, either as a correct
response or as one of three different types of erroneous responses. 
The point is that in this procedure each type of generalization 
tends to produce erroneous reactions of a distinct form. This means 
that instead of simple interferences pooled (∸), we must convert
each of the different reaction potentials into an approximation
to the equivalent probability of the occurrence of each type of
response. We do this on the analogy of the computations of the
statistical probability of differences between means. Here we shall
make the simple provisional assumption that the standard deviation
of all four bEr values in each column is 1.0. It follows that the
standard deviation of each difference between these values must be
1.0 × √2, or 1.414. Taking the first pair of values of column I,
Table 14, 2.16 and .57, we find the difference to be 1.59. The ratio
of this to its standard error is,

1.59 / 1.414 = 1.124.

Looking up this value in a normal probability table, we discover
that with two forces of 2.16σ and .57σ opposing each other, the
2.16σ will dominate 86.96 per cent of the time, and the .57σ will
dominate 13.04 per cent of the time.

But there is yet to be considered the simultaneous competition
in this column of the .21σ and the .08σ with the two values just
considered. Calculating the probabilities in the same way for the
pairs 2.16σ and .21σ, and 2.16σ and .08σ, yields the probabilities of
91.61 and 8.39, and of 92.94 and 7.06, respectively. We accordingly
have the probabilities,

86.96 vs. 13.04,

91.61 vs. 8.39,

92.94 vs. 7.06,

all based on 2.16σ. But the probability standing for the 2.16σ is
different in each case. Therefore the 8.39 and the 7.06 do not
correspond to the 13.04. This is rectified by the proportions,

91.61 : 8.39 :: 86.96 : x,

92.94 : 7.06 :: 86.96 : x.

Solving these proportions, we find the respective x values to be
7.96 and 6.61. We now have the following four probability values
all in their true proportions: 

86.96 

13.04 

7.96 

6.61 

Total 114.57 

But the total probability, however many possibilities exist, must 
always amount to 1.00. This means that the values are too large. 
Accordingly we divide each of the four values by 1.1457 to reduce 
them to the proper size. As a result of this division we have: 

75.90 

11.38 

6.95 

5.77 

Total 100.00 
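
The whole computation for one choice point may be condensed as in the sketch below. It uses the provisional assumption stated above (a standard deviation of 1.0 for each reaction potential, hence √2 for a difference) and a normal probability integral in place of the table. Rescaling every pair to the common value 86.96 and then dividing by the total is arithmetically the same as normalizing the odds q/p of each pair, which is the form used here; that restatement, and the names `normal_cdf` and `response_percents`, are introduced only for illustration.

```python
import math

SD_DIFF = math.sqrt(2.0)  # SD of a difference when each potential has SD 1.0


def normal_cdf(z):
    """Area of the unit normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def response_percents(correct, competitors):
    """Return [per cent correct, per cent for each competing response]."""
    odds = []
    for rival in competitors:
        p = normal_cdf((correct - rival) / SD_DIFF)  # correct one dominates
        odds.append((1.0 - p) / p)                   # rival relative to correct
    total = 1.0 + sum(odds)
    return [100.0 / total] + [100.0 * o / total for o in odds]


# Column I of Table 14: 2.16 opposed by .57, .21, and .08.
COLUMN_I = response_percents(2.16, [0.57, 0.21, 0.08])
# Column IV of Table 14: 2.67 opposed by .30, .04, and .002.
COLUMN_IV = response_percents(2.67, [0.30, 0.04, 0.002])

print([round(v, 2) for v in COLUMN_I])
print([round(v, 2) for v in COLUMN_IV])
```

For column I the sketch returns values within a few hundredths of 75.90, 11.38, 6.95, and 5.77; the slight differences arise from the use of the integral instead of a four-place table.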

We now see that at I there are 75.90 chances in 100 that the correct 
response will be evoked at that choice point; that there are 11.38 
chances in 100 that the response proper at II will be evoked at I; 
that there are 6.95 chances in 100 that the response proper at III 

TABLE 18. The theoretical per cent of correct response evocations (bold-faced type)
and the per cent of erroneous response evocations of the three forms in four-link
heterogeneous linear maze chaining with terminal reinforcement. Derived by
computation from Table 14.

Choice points                              I        II       III      IV

Responses correct if given at I          75.90     5.33     3.68     2.75
Responses correct if given at II         11.38    73.30     5.45     2.92
Responses correct if given at III         6.95    14.25    77.16     4.42
Responses correct if given at IV          5.77     7.12    13.71    89.91

Total per cent erroneous responses
at each choice point                     24.10    26.70    22.84    10.09


will be evoked at I; and that there are 5.77 chances in 100 that the 
response proper at IV will be evoked at I. 

We make similar computations from the data in the second, 
third, and fourth columns of Table 14, and record all these theoretical
values in Table 18, with the correct probabilities set in
bold-faced type and the error probabilities in ordinary type. The 




total erroneous response per cents at the several choice points are 
given in the last row of values of this table. A glance at these 
values reveals that theoretically in maze learning the score in error 
per cents is: 

I > IV, II > III, II > I, and III > IV, 

exactly as was deduced for simple chaining (Theorem 34). 

Turning now to the question of the empirical soundness of the 
deduction, we find a study exactly on the point. An inspection of 
Hull’s relevant published figures (13, p. 123) shows that,

I > IV, II < III, II > I, and III > IV, 


which agrees with Theorem 34 on three of the four points. It 
must be noted that the disagreement, which involves choice points 
II and III, is with the first-choice errors but not with the total 
errors. Moreover, exactly the same disagreement between theory 
and empirical fact is found in an experiment by Hill (13, p. 119),
so that the inconsistency can hardly be due to sampling. Up to 
the present time we have not made any detailed distinction between 
total errors and first-choice errors in our deductions. It is probably 
too early to press the theory to that amount of detail. 


Generalization in Linear Maze Heterogeneous Trial-and-Error Chaining
The fact that separate heterogeneous generalization gradients in
terms of response probability have just been deduced in some detail
permits us to make some additional comparisons with empirical
fact. An inspection of the row of response probabilities correct if
given at IV (antedating generalizations, Table 18) shows that the
fall from IV to I is progressively less at each choice point; i.e.,
that the gradient curvature is in general concave upward. The same
holds in the row of theoretical response probabilities correct if
given at I (perseverative generalizations). Both have been tacitly
assumed in our theoretical deductive procedure.

Turning to the experimental results (13, p. 127),
we find that the corresponding generalization data are:

Antedating generalizations: 12.8, 19.1, 37.1, 65.9

Perseverative generalizations: g uY gj’

Thus the theory is validated for the most part in regard to the
curvature of the two types of generalization gradient. 

A second examination of the rows of responses which would be 
correct if given at I and if given at IV of Table 18 will show that the
figures in the first or perseverative row of theoretical erroneous
generalization values are much smaller than those in the last row
of theoretical erroneous antedating generalization values (correct 
if given at IV). This also was tacitly assumed by our choice of the 
exponents in equations 50 and 51. Glancing again at the two rows 
of corresponding empirical generalization values given in the pre- 
ceding paragraph, we see that the antedating error values are dis- 
tinctly larger than are the corresponding perseverative error values. 
Thus our original assumptions as represented in equations 50 and 
51 appear to be fully substantiated. 

A third theoretical matter here concerns the progressive influence 
of the learning process on the slope of a given generalization 
gradient. Consider, for example, the antedating gradient correct 
when the response is given at IV. This is really a discrimination 
gradient positively reinforced when the response is given at IV 
and not reinforced, i.e., partially extinguished, when it is incorrectly 
given at I, II, and III. It follows that with practice the gradient 
will rise at IV and will fall relatively at III, II, and I. Empirical
evidence on this point is available in the study already cited. The 
empirical values just mentioned were the average results from a 
total of 50 trials. Computations from the published tables (13,
p. 124) show that the corresponding mean antedating generalization
gradients for the first and second ten trials respectively are:

                     I        II       III      IV
First 10 trials:     20.42    21.11    23.47    36.81
Second 10 trials:    14.31    21.67    39.58    63.47

These results indicate that increase in the training raises the value 
at IV from 36.81 to 63.47, and lowers that at I from 20.42 to 14.31, 
which verifies the theoretical expectation. 

Homogeneous Linear Maze Chaining with Serial Reinforcement
The experimental technique of the homogeneous linear maze 
chaining with serial reinforcement was exactly the same as that of 
homogeneous linear maze chaining with terminal reinforcement, 
except that a pellet of food was found by the animal at once after
it had passed through the door at each of the four choice points 
(Figure 45). This means that the main theoretical analysis pre- 
sented in Table 16 will also hold in the present situation. There is 
this difference, however: we must convert the reaction potentials 
in the next-to-last line into errors (R) instead of latencies (stft). To 
do this we must again use the Yamaguchi-Sprow equation (53) 
(27). This yields the following values: 

I II III IV 
.572 .408 .434 .689 

which, as usual, yield the following inequalities: 

IV > I, III > II, I > II, and IV > III. 

This experiment was performed by Gladstone (d). His corre- 
sponding empirical R values were: 


I       II      III     IV 
.528    .244    .356    .776, 

IV > I, III > II, I > II, and IV > III. 


Accordingly Theorem 36 appears to be validated for homogeneous 
linear maze chaining with serial reinforcement. 
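
The ordinal comparison just made may be restated as a short computation. The following Python sketch is illustrative only; the variable and function names are ours and form no part of the original analysis. It takes the theoretical and the empirical R values quoted above and tests each of the four inequalities on both sets.

```python
# Theoretical and empirical R values at choice points I-IV, as quoted above.
theoretical = {"I": 0.572, "II": 0.408, "III": 0.434, "IV": 0.689}
empirical = {"I": 0.528, "II": 0.244, "III": 0.356, "IV": 0.776}

# Each pair (a, b) stands for the predicted inequality a > b.
predicted = [("IV", "I"), ("III", "II"), ("I", "II"), ("IV", "III")]


def holds(values, pairs):
    """Return one boolean per predicted inequality, True when it is satisfied."""
    return [values[a] > values[b] for a, b in pairs]


theory_ok = holds(theoretical, predicted)
empirical_ok = holds(empirical, predicted)

for (a, b), t, e in zip(predicted, theory_ok, empirical_ok):
    print(f"{a} > {b}: theory={t}, empirical={e}")
```

A check of this kind tests only the direction of each difference, which is all that the inequalities assert; it says nothing about the closeness of the theoretical to the empirical magnitudes.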

Heterogeneous Linear Maze Chaining with Serial Reinforcement 

We perform the theoretical computations for heterogeneous linear 
maze chaining with serial reinforcement from Table 17 by the 
[...] connection with the construction of [...]. The theoretical 
error values as thus derived are given in the last line of Table 19. 
An inspection of these values shows that, 

IV > I, III > II, II > I, and III > IV, 

[...] empirical findings for simple chaining [...] this phase of [...] 
choice points were, on the average, as follows (14, p. 20): 

I       II      III     IV 
41.2    54.5    58.4    48.3 

An inspection of these values shows that, 

IV > I, III > II, II > I, and III > IV, 

quite in accordance with theoretical expectation. However, the 
predicted values are much smaller than are the experimental ones. 

TABLE 19. The theoretical per cent of correct response evocations (bold-faced type) 
and the per cent of erroneous response evocations of the three different forms, 
heterogeneous chaining with serial reinforcement. Derived by computation from 
Table 17. 

Choice points                            I        II       III      IV 
Response correct if given at I           73.74    1.58     1.96     2.73 
Response correct if given at II          3.69     90.07    2.97     2.95 
Response correct if given at III         1.50     6.36     88.16    4.97 
Response correct if given at IV          1.06     1.99     6.90     89.35 
Total per cent erroneous responses 
  at each choice point                   6.25     9.93     11.83    10.65 


Here again both theory and empirical fact permit the distinction 
of the different types of erroneous responses as related to chaining 
generalization. All the theoretical expectations to be verified in 
the case of heterogeneous linear maze chaining have been observed 
in this situation, but with one curious addition. It will be noticed 
in row I of Table 19 that the generalization falls to 1.58 at II, then 
rises to 1.96 at III, and to 2.73 at IV. Glancing at a figure published 
in the study referred to above (14, p. 21) we see that the tendency 
for an upward tilt of this generalization gradient as expressed in 
percentages is anticipated by theoretical expectation. But this is 
too fine a point to be elaborated in detail at the present immature 
state of the science. 
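
The arithmetic relating the body of Table 19 to its last line may be sketched as follows. The Python below is an illustration under our own naming, not part of the original computation: it sums the erroneous-response entries falling at each choice point and also checks the upward tilt in row I noted above. Only the off-diagonal (erroneous) entries of the table are used.

```python
# Erroneous-response percentages from Table 19 (off-diagonal cells only).
# errors[row][col]: per cent of responses that would be correct at `row`
# but were given at choice point `col`.
points = ["I", "II", "III", "IV"]
errors = {
    "I": {"II": 1.58, "III": 1.96, "IV": 2.73},
    "II": {"I": 3.69, "III": 2.97, "IV": 2.95},
    "III": {"I": 1.50, "II": 6.36, "IV": 4.97},
    "IV": {"I": 1.06, "II": 1.99, "III": 6.90},
}

# Total per cent of erroneous responses at each choice point (column sums).
totals = {
    col: round(sum(row[col] for row in errors.values() if col in row), 2)
    for col in points
}

# Row I generalization read from II outward to IV.
row_one = [errors["I"][col] for col in ("II", "III", "IV")]
row_one_rising = all(a < b for a, b in zip(row_one, row_one[1:]))

print(totals)
print("row I rises from II to IV:", row_one_rising)
```

The column sums so obtained can be compared directly with the last line of the table.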

Difficulty in Heterogeneous Chain Learning as a Function of the Length of 
the Chain 

At this point in our analysis of forms of heterogeneous behavior 
chain learning with serial reinforcement, we consider a new 
aspect. This concerns the ease or difficulty of the learning as 
dependent on the length of the chain. Because of its relative sim- 
plicity we shall consider the difficulty or ease involved in learning a 
three-link chain. Table 20 shows the response behavior characteristic 
of this form of learning. The series gradient is the same as that used 
in Tables 17, 18, and 19 except that it has only the first three links. 
The generalizations are based on the same exponents but the values 
in Table 20 differ somewhat because some of the numbers involved 
come in different combinations. Finally, the last two rows of the 
table give respectively the equivalent correct and erroneous per 
cent of responses at each of the three choice points. 


TABLE 20. The error characteristics of a three-link heterogeneous behavior chain with 
serial reinforcement. 

d values                                     4        3 
Choice points                            I        II       III 
Response correct if given at I           3.20     .20      .03 
Response correct if given at II          .76      [...]    .38 
Response correct if given at III         .24      .95      2.67 
Per cent of correct responses            94.10    90.72    91.94 
Per cent of erroneous responses          5.90     9.28     8.06 


First it will be noticed that in erroneous responses: 

III > I, II > I, and II > III, 

just as we expect from such chaining theory. But our present 
concern is mainly with the number of errors made in the per- 
formance of the three-link chain as compared with a four-link 
chain. Comparing the last row of Table 19 with that of Table 20, 
a glance will show that at comparable points the [...] responses 
are associated with the [...] chain. One obvious [...] generalization 
from the extra link of the [...]. [...] considerations, we arrive at our 
next theorem: 

THEOREM 38. [...] chains increase in length, [...] 

The difference between per cent erroneous responses at the initial 
choice points of the two chains is, 


6.25 - 5.90 = .35, 


whereas the difference between per cent erroneous responses at 
the final choice points of the two chains is, 

10.65 - 8.06 = 2.59. 

But, 


2.59 > .35. 

Generalizing on these considerations we arrive at our next 
theorem: 


THEOREM 39. As heterogeneous behavior chains increase in length, 
the amount of learning remaining constant, the per cent of erroneous 
responses at the posterior end of the chain increases more rapidly than 
does that at the anterior end. 
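
The comparison on which Theorem 39 rests may be restated as a brief computation. The Python sketch below is illustrative only, with names of our own choosing; it takes the last rows of Tables 19 and 20 and forms the differences at the anterior and posterior ends of the two chains.

```python
# Per cent erroneous responses at each choice point
# (last rows of Tables 19 and 20 respectively).
four_link = [6.25, 9.93, 11.83, 10.65]
three_link = [5.90, 9.28, 8.06]

# Difference at the initial (anterior) choice points of the two chains.
anterior_diff = round(four_link[0] - three_link[0], 2)

# Difference at the final (posterior) choice points of the two chains.
posterior_diff = round(four_link[-1] - three_link[-1], 2)

print("anterior difference:", anterior_diff)
print("posterior difference:", posterior_diff)
print("posterior exceeds anterior:", posterior_diff > anterior_diff)
```

It should be observed that the comparison rests on two chain lengths only, so that the sketch illustrates the direction of the effect rather than its rate.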

Turning to the question of the empirical verification of Theorems 
38 and 39, we find that no evidence exists of the sort with which 
we have hitherto concerned ourselves. For this reason we introduce 
a relatively new consideration. This is that rote learning is a form of 
heterogeneous serial chaining, the reinforcement in this case being of 
the secondary variety. This arises from the subject’s discovery of 
the correctness or incorrectness of his response soon after it has 
been made. 

Theorem 38 is substantiated by great amounts of experimental 
work on rote learning. Relevant data are found in studies by 
Ebbinghaus (5), Meumann (19), Lyon (18), and Hovland (10). 
Table 21 gives an example from Meumann. 

TABLE 21. The number of repetitions required to learn different lengths of nonsense 
syllable series. Data from Meumann (19). 

Number of syllables in series    Repetitions 
 8                                5.2 
12                               10.4 
16                               17.0 
18                               21.5 
24                               30.0 
36                               32.5 

Theorem 39 is substantiated by Hovland’s rote learning results 
shown in Figure 47. Despite a certain amount of irregularity 
evidently due to sampling limitations, it is clear that the posterior 
ends of the three curves differ more from each other on the average 
than do the anterior ends. 



A Mixed Form of Behavioral Chaining: The Double Alternation Experiment 

[...] at a time [...] type of [...] heterogeneous alone. 
It is obvious, however, that in a single behavior chain the two 
types of responses (as well as different types of reinforcement) 
may be combined. [...] here for the analysis of [...] behavioral chaining. 
Because it was one of the earliest [...] investigated 
we have chosen what Hunter [...] 
meant the combination of two [...] homogeneous chains as 
A A and/or B B into a four-[...], [...] 
being different from the B responses. [...] might perform the 
double alternation experiment on the linear maze of Figure 44 
by requiring the rat to go through the first door at I and at II, and 
through the fourth door at III and at IV. The combination of 
choosing the first and fourth doors of the maze constitutes the 
heterogeneous element in the chain. 

We proceed in this analysis on the basis of terminal reinforce- 
ment, the form usually employed. The first two d values will be 
4 and 2, as of the homogeneous series, and the third will be 3, the 
difference in stimulation resulting in the change of response at III, 
as at this phase in the heterogeneous series. The gradient of rein- 
forcement values and the generalization exponents are the same 
as those previously employed. Table 22 was generated on these 
assumptions. Since in this case there are two distinct types of 
response, each column of sEr’s is summated in two portions; these 
are shown in the rows I + II and III + IV respectively. The sEr 
summations determine the theoretical per cent of correct or of 
erroneous responses (shown by a probability table) at any given 
choice point, as calculated by the procedure described above 
(p. 177) for the simpler situation of only two alternative responses. 
The percentages are given in the last two lines of Table 22. 

TABLE 22. The theoretical analysis of the mixed form of joint homogeneous-het- 
erogeneous behavioral chaining with terminal reinforcement, known as double 
alternation. 

d values                                4        2        3 
Response number                     I        II       III      IV 
Response reinforced at I            2.16     .14      .03      .004 
Response reinforced at II           .57      2.25     .57      .07 
Response reinforced at III          .30      1.21     2.41     .30 
Response reinforced at IV           .12      .47      .95      2.67 
I + II                              2.52     2.34     .60      .07 
III + IV                            .41      1.59     2.98     2.84 
Per cent correct responses          93.22    70.21    95.38    97.49 
Per cent incorrect responses        6.78     29.79    4.62     2.51 
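
The rows I + II and III + IV of Table 22, and the percentages in its last two lines, can be approximately recovered by a short computation. The Python sketch below is illustrative only and rests on two assumptions that are not stated in this section: that two reaction potentials a and b summate as a + b - ab/M with a ceiling M of 6.0, and that the per cent of correct responses is the normal probability of the difference between the competing summations divided by the square root of 2. Both assumptions, and all names in the code, are ours; they are adopted because they reproduce the tabled values closely.

```python
import math

M = 6.0  # ASSUMED ceiling of reaction potential in sigma units (not given here)


def summate(a, b, ceiling=M):
    """Combine two reaction potentials by the assumed rule a + b - a*b/ceiling."""
    return a + b - a * b / ceiling


def pct_correct(correct, incorrect):
    """Per cent correct for two competing potentials (assumed procedure).

    The difference is divided by sqrt(2) and referred to the normal
    probability integral; Phi(d / sqrt(2)) equals 0.5 * (1 + erf(d / 2)).
    """
    return 100.0 * 0.5 * (1.0 + math.erf((correct - incorrect) / 2.0))


# Body of Table 22: reaction potential of each reinforced response
# as it appears at response numbers I, II, III, IV.
rows = {
    "I": [2.16, 0.14, 0.03, 0.004],
    "II": [0.57, 2.25, 0.57, 0.07],
    "III": [0.30, 1.21, 2.41, 0.30],
    "IV": [0.12, 0.47, 0.95, 2.67],
}

# Column-wise summation of the two A-type rows and of the two B-type rows.
sum_a = [summate(a, b) for a, b in zip(rows["I"], rows["II"])]
sum_b = [summate(a, b) for a, b in zip(rows["III"], rows["IV"])]

# The A response is correct at I and II, the B response at III and IV.
correct = sum_a[:2] + sum_b[2:]
incorrect = sum_b[:2] + sum_a[2:]
pct = [pct_correct(c, i) for c, i in zip(correct, incorrect)]

for name, a, b, p in zip(["I", "II", "III", "IV"], sum_a, sum_b, pct):
    print(f"{name}: I+II={a:.2f}  III+IV={b:.2f}  per cent correct={p:.2f}")
```

Small discrepancies in the second decimal place are to be expected, since the tabled percentages were presumably read from a printed probability table using rounded summations.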

A glance at the series of theoretical error values shows that there 
is a sharp increase in the number of errors at II, the terminus of the 
first group of homogeneous links of the chain. In the body of the 
table we see that this is due to the relatively large generalization 
values in rows III and IV, which summate to 1.59σ as compared 
with the summation of 2.34σ for the correct reaction tendency. 
The fact that the error maximum falls at II rather than III is 
evidently due in the last analysis to the principle of greater strength 
of antedating generalization tendencies as compared with persev- 
erative generalizations. 

Generalizing on the preceding considerations, we arrive at our 
next theorem: 

THEOREM 40. In a mixed homogeneous-heterogeneous four-link 
linear maze chaining situation, A A B B, the learning task is more 
difficult for albino rats to master than that in either the pure homogeneous 
or the pure heterogeneous situation, the maximum difficulty lying 
definitely at choice point II where it exceeds that of the pure heterogeneous 
chaining situation, the relationships of the errors being I > IV, 
II > III, II > I, III > IV. 

The empirical work on double alternation or alternate repetition 
is fairly extensive. The most comprehensive single quantified study 
as well as the most recent one was reported by Woodbury (22; 23; 
24). He employed a linear maze with four choice points constructed 
on the same general principle as the maze shown as Figure 44 
except that there were only two gates at each choice point. At 
trials 41-50 his animals made the following per cent of errors: 

I       II      III     IV 
11.3    41.3    12.0    2.0, 

I > IV, II > III, II > I, and III > IV. 

[...] possibility of some difference in the stimuli at the various choice 
points, but in the temporal maze all of the choice points constitute 
the identical spot spatially. However, rats are believed to be very 
sensitive to tracking odors. On this hypothesis the choice point 
after one traversal around the right-hand square and up the cross- 
piece will have the odor of the first track, and after the traversal 
of the square twice this choice point will have an even stronger 
odor from the two tracks. This more intense stimulus might lead 
the animal to turn in the opposite direction and go around the 
left-hand square twice. Then of course there is the factor of the 
semi-circular canals; clearly there is a difference in the function of 
these organs when the animal repeatedly passes through the same 
point in space as distinguished from when it performs the same act 
twice and then a somewhat different act twice, as in Woodbury’s 
experiment. 

Hunter’s double alternation temporal maze is obviously not 
identical with Woodbury’s double alternation linear maze. The 
former is probably more difficult for animals to master than is the 
latter. As a sample of Hunter’s remarks, we have (7d, p. 528): 

Rat 1 had run the tridimensional l l r r maze 14 times in 
succession when it was started on the l l r r temporal maze. 
It was given 109 trials on the temporal maze, but never made a 
single correct trial. At no time did the animal respond l l l l. 
With but few exceptions the responses were l r r r or r r r r. 

Summary 

In summarizing briefly the preceding pages we must emphasize 
that any forms of behavior links whatever may, and do, constitute 
simple behavior chains. The restriction in choice of the types of 
chains discussed above is due to the limited number of those 
experimentally studied in a quantitative manner so far; and this in 
turn is due to the limitation in the characteristics of the chains 
which make them simple enough for quantitative interpretations 
to be feasible. Even so, we have cited results on the following types: 
isolated acts (Arnold); the compound trial-and-error chaining of 
the linear maze (Hill), with terminal reinforcement (Sprow), and 
serial reinforcement (Gladstone); the homogeneous chain (Arnold); 
the heterogeneous chain (Hull); the mixed homogeneous-hetero- 
geneous chain (Woodbury); the temporal maze (Hunter); and the 
rote learning of nonsense syllables (Hovland). Much chaining in 
ordinary life is verbal and so is related to rote learning such as that 
of memorized words: poetry, songs, rituals, prose, and so on. 
Obviously these chains may be as short as two words, as in the 
association experiment; or they may extend to very large numbers 
of links, as in elementary numerical counting. 

As we have shown, the detailed derivation of the several second- 
ary laws of behavioral chaining involves a certain amount of 
complexity, but the chaining laws that have emerged so far are 
moderately simple even if the modes of their manifestation are 
fairly varied. Stated in terms of errors in four-link chains these laws 
are: 

1. Homogeneous chains tend strongly to the formula: II > I 
and III > IV. 

2. Heterogeneous chains tend strongly to the opposite formula: 

I > II and IV > III. 

3. Both homogeneous and heterogeneous chains when given 
terminal reinforcement tend strongly to the formula: I > IV and 

II > III. 

4. Both homogeneous and heterogeneous chains when given 
serial reinforcement tend strongly to the opposite formula: IV > I 
and III > II. 

5. Mixed behavior chains composed of homogeneous and hetero- 
geneous subchains of equal length, when given terminal reinforce- 
ment, follow the formula: I > IV, II > III, II > I, and III > IV, 
with the magnitude of II relatively greater than that found in pure 
heterogeneous chaining. The mixed form of chaining with serial 
reinforcement has not yet been investigated either theoretically or 
empirically, though the primary principles utilized above offer a 
ready means for this on the theoretical level. 

Terminal Notes 

THE SIMPLE LOCOMOTION OF RATS TO A TERMINAL GOAL 

Another bit of evidence which seems to bear on the above theo- 
retical deduction of homogeneous chaining comes from an experi- 
ment which to superficial appearance is very different from those 
discussed in the preceding pages. This experiment investigated the 
speed of locomotion of rats in the approach to food through straight 
20-foot and 40-foot runways (11). Typical results thus secured by 
five-foot sections of the runways were: 

                     I        II       III      IV       V        VI       VII      VIII 
8-section runway:    6.32"    3.33"    2.57"    2.30"    2.48"    2.33"    1.77"    2.42" 
4-section runway:    2.31"    2.08"    1.86"    1.99" 


The relevancy of this experiment to the homogeneous terminal 
reinforcement problem lies in the fact that each cycle of locomotor 
activity is like every other and corresponds to the cycle or behavior 
link of pressing the disks in Arnold’s experiment of this type. The 
mean time required to traverse each five feet of the 20-foot runway 
was as shown in the 4-section line of data given above. This shows 
the same tilt-up at section IV as do Figures 43 and 44: 1.86" vs. 
1.99". Similarly, the 8-section runway shows a tilt-up at the final 
section: 1.77" vs. 2.42". It is clear, however, that some factor not 
yet known is involved here. This is indicated by the fact that the 
point of minimum latency tends with continued reinforcements to 
approach the middle of the series, though upon partial extinction 
or satiation it again returns to the penultimate response. 

The Lack of Homogeneity Within Each Link of a Behavior Chain 

Throughout the present chapter we have considered the matter of 
homogeneity and heterogeneity in terms of the behavior of the 
links of a chain as such, without analyzing the homogeneity within 
the links themselves. Thus it is possible to have the links of pressing the 
disk repeated (as in Arnold’s homogeneous experiment, pp. 156 ff.), 
even though the movements in the early phase of pressing the disk, 
such as reaching the paw forward toward the disk, are different 
from the terminal movements, such as withdrawing the paw. 
Similarly, the locomotion in a straight line discussed just above is 
homogeneous when treated as analogous to complete stepping 
cycles, even though different legs are involved in doing distinctly 
separate things at different parts of the cycle (11, p. 206). This 
means that there remains the task of considering the behavior 
principles involved in the integration process in smaller parts 
than those considered above. This will be the task of our next 
chapter. 

7. Learning within the Individual Behavior Link 


Simple trial-and-error learning, considered in Chapter 2, was de- 
scribed as a process whereby one response (R+) of two fairly 
distinct ones evocable by the same stimulus combination is progres- 
sively strengthened, whereas another (R−) is extinguished. At each 
trial only one act occurs, and it is immediately reinforced (R+) 
or not reinforced (R−). 

Again, in the learning of reaction chains by heterogeneous com- 
pound trial and error and terminal reinforcement, say, as presented 
in Chapter 6 (p. 165 ff.), several behavioral links of the sort that 
are mentioned in the preceding paragraph often are involved in a 
chain and all are reinforced at the end of the sequence by a single 
event (feeding). In the linear maze there considered a given true 
response simply cannot be performed until that of the preceding 
link has been correctly performed, and at the entrance to each 
section of the maze a trial-and-error process of door selection must 
occur before the animal can continue its forward locomotion. This 
means that there is a checking of forward movement at every wrong 
response, and at least a secondary reinforcement (based on the 
subsequent forward locomotion) at every correct response. From a 
behavioral point of view we must conclude that apart from the 
differences between one act and the next it is this interruption 
by the occurrence of errors in progress toward the goal that sep- 
arates behavior chains into distinct links. This, it is believed, 
[...] trial-and-error learning of behavior links. 

In the present chapter we shall consider the associative organiza- 
tion within the separate act involved in simple trial-and-error 
learning (Chapter 2) and the behavior link, as distinguished from 
the organization of numerous total links into a chain as a whole 
(Chapter 6). Thus the units of our present analysis at once become 
smaller than those considered heretofore. From dealing with the 
entire behavior link as a unit, we shall now be concerned with the 
fractional action phases of numerous distinct muscles that occur 
simultaneously with nicely graded intensity contractions and syn- 
chronizations so as to bring about a state of affairs which as a whole 
is reinforcing or not reinforcing. In this way the reinforcement 
operative within behavior links is all or none in nature. Also, our 
analysis will concern the phenomena of what may be said to be 
minute behavior, behavior which ultimately will become so slight 
in extent as to be quite unobservable by the present-day method- 
ologies of behavior investigation. At the same time the analysis 
itself will still be on a molar basis, as are all our analyses, in the sense 
that we shall not attempt an ultimate physiological interpretation. 
Accordingly we shall speak of this as a micro-molar approach, and 
in the following pages we shall adopt a manner of exposition rather 
different from that of the preceding chapters. 

Micro-molar Analysis of Contraction-Intensity Selection by the All-or-none 
Type of Reinforcement 

In order to save ourselves from becoming lost in a maze of exposi- 
tory details, it will be necessary for us to strip the theoretical situa- 
tion down to the barest essentials; by this device we should be 
able to take up the six progressively more complex theoretical cases 
which are to follow without inflicting on the reader undue difficulty 
of comprehension. Moreover the computational methodologies 
which we have used heretofore, on the coarser analyses, will now 
be ignored for the most part because the detailed outcome cannot 
be checked easily on such minute phenomena. 

CASE I. Let us assume, then, the joint action of only two muscles, 
A and B, occurring simultaneously over an instant of time, and 
followed immediately by reinforcement or non-reinforcement. 
Further let us assume that each muscle has available but two con- 
traction intensities. We shall number these contraction intensities 
I and II. We shall also assume that these contraction intensities 
initially are equally likely to occur. There are thus in our simple 

THEOREM 41. Other things equal, the all-or-none type of differ- 
ential reinforcement of the joint outcome of the simultaneous contrac- 
tion-intensity of each of several muscles involved in a simple behavior 
link will, within the limits of the normal oscillation range, result in 
the gradual elimination of the maladaptive phase combination and its 
gradual replacement by the adaptive phase combination. 

CASE II. We proceed next to the slightly more complex situation 
where everything is assumed to be the same as in Case I except that 
two sequential alternative contraction-intensity phases are involved 
in each of the two muscles. Adding Arabic numerals to indicate 
the order of the sequential contraction phases involved, we have: 



                               Correct 
                               combination 
First contraction phases:      A I 1          A II 1 
                               B I 1          B II 1 
Second contraction phases:     A I 2          A II 2 
                               B I 2          B II 2 

Taking the various possible of the equally probable combinations 
yielded by chance, and assuming the same amounts of reinforce- 

TABLE 24. The various combinations of equally probable contraction intensities in 
Case II, together with the resulting increment of reinforcement and inhibition for 
each evocation. 

Correct combination: 

A I 1 = +4.0 - .2; B I 1 = +4.0 - .2; A I 2 = +4.0 - .2; B I 2 = +4.0 - .2 

Incorrect combinations: 

A I 1 = -.2; B I 1 = -.2; A I 2 = -.2; B II 2 = -.2 
A I 1 = -.2; B I 1 = -.2; A II 2 = -.2; B I 2 = -.2 
A I 1 = -.2; B I 1 = -.2; A II 2 = -.2; B II 2 = -.2 

A I 1 = -.2; B II 1 = -.2; A I 2 = -.2; B I 2 = -.2 
A I 1 = -.2; B II 1 = -.2; A I 2 = -.2; B II 2 = -.2 
A I 1 = -.2; B II 1 = -.2; A II 2 = -.2; B I 2 = -.2 
A I 1 = -.2; B II 1 = -.2; A II 2 = -.2; B II 2 = -.2 

A II 1 = -.2; B I 1 = -.2; A I 2 = -.2; B I 2 = -.2 
A II 1 = -.2; B I 1 = -.2; A I 2 = -.2; B II 2 = -.2 
A II 1 = -.2; B I 1 = -.2; A II 2 = -.2; B I 2 = -.2 
A II 1 = -.2; B I 1 = -.2; A II 2 = -.2; B II 2 = -.2 

A II 1 = -.2; B II 1 = -.2; A I 2 = -.2; B I 2 = -.2 
A II 1 = -.2; B II 1 = -.2; A I 2 = -.2; B II 2 = -.2 
A II 1 = -.2; B II 1 = -.2; A II 2 = -.2; B I 2 = -.2 
A II 1 = -.2; B II 1 = -.2; A II 2 = -.2; B II 2 = -.2 

theoretical situation the following contraction possibilities: 

Correct 

combination 

a 1 All 

BI BII 

Now let us assume that the contraction-intensities A I and B I 
will each be reinforced when they occur jointly, but that none of 
the other combinations will he. Finally let us assume that a single 
reinforcement will add 4 points to the habit strength of each muscle- 
contraction phase involved, and that each reaction evocation, 
whether correct or not, will add .2 of a unit of inhibition to each 
muscle-contraction phase involved. Accordingly we shall have the 
possible combinations shown in Tabic 23, together with the numer- 

T A B L E 23. The theoretical reinforcement and inhibitory combinations together with 
a summary of the net effect on each reaction intensity involved in one evocation of 
each combination possible within a single behavior link. 

Reinforced combination: A I ■ +4.0 — .2; B I — +4.0 — .2 

Non-relnforeed combination: A I • —.2; B II «■ — ,2 
Non-relnforeed combination: A II » —.2; B I —.2 
Non-rcinforced combination: A II « —.2; B II » —.2 
In summary of A I’l: 4.0 — .2 — .2 •• 4.0 — .4 - 3.6 
In summary of B I's: 4.0 — .2 — .2 w 4.0 — .4 » 3.6 
In summary of A Il’s: .2 — .2 • —.4 
In summary of B IPs: .2 — .2 —.4 

ical results of the consequent reinforcement or lack of reinforcement 
shown in the four summary lines. This means, of course, that the 
two correct reaction intensities on a single set of equally likely 
trials have both gained 3.6 net points despite one failure combina- 
tion of each, and that the incorrect phase of each muscle has lost 
.4 of a point from the two failures. Thus in this sample set of trials 
there is a net advantage of a correct phase, such as A I, over the 
competing incorrect phase. A II, since A I gains strength as A II 
loses. It is evident that while this seemingly indiscriminate rein- 
forcement or non-reinforcement alike of all contraction phases 
involved within a given behavior link on a given occasion is some-
what different from the trial-and-error learning of behavior links
as a whole, it is perfectly consistent with the gradual but ultimate 
selection of the correct contraction phase combination. 
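
The arithmetic of Table 23 can be checked with a short sketch. It assumes, as the text does, that the four combinations of two muscles at two intensities are equally likely, that only the joint occurrence of A I and B I is reinforced (4 points to each phase), and that every evocation costs each participating phase .2 unit of inhibition. The function and constant names are illustrative only and do not come from the text.

```python
from itertools import product

REINFORCEMENT = 4.0  # points added to each phase of the reinforced combination
INHIBITION = 0.2     # inhibition added to each phase at every evocation


def table_23_summary():
    """Net change per contraction phase over one set of the four combinations."""
    net = {"A I": 0.0, "A II": 0.0, "B I": 0.0, "B II": 0.0}
    for a, b in product(("I", "II"), repeat=2):
        # Only the joint occurrence of A I and B I is reinforced.
        gain = REINFORCEMENT if (a, b) == ("I", "I") else 0.0
        for phase in (f"A {a}", f"B {b}"):
            net[phase] += gain - INHIBITION
    # Round to one decimal to hide binary floating-point noise.
    return {phase: round(value, 1) for phase, value in net.items()}


if __name__ == "__main__":
    for phase, value in table_23_summary().items():
        print(f"{phase}: {value:+.1f}")
```

Under these assumptions the correct phases end at +3.6 and the incorrect phases at -.4, in agreement with the four summary lines of the table.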

Generalizing from the preceding considerations, we arrive at our 
next theorem: 
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combination of stimulation which forms part of a generalization 
continuum, the other portions of which may not be directly associ- 
ated with reinforcement. Now the portion of this continuum which 
is consistently associated with success (reinforcement) automatically 
acquires (Corollary ii) the power of secondary reinforcement (4,
pp. 84 ff.). Moreover, the power of secondary reinforcement itself
(4, pp. 183 ff.) presumably generalizes to other portions of the 
continuum according to the principle of stimulus generalization 
(X). Also it will be recalled that stimulus generalization operates 
as a negative growth function, the maximum point of the generali- 
zation being at that point of the stimulus continuum which is 
directly associated with reinforcement (5, pp. 18 ff.). 

It thus comes about that acts which on certain trials fail of 
primary reinforcement of the all-or-none variety but approach 
more or less closely to the conditions necessary for such reinforce- 
ment, will receive secondary reinforcement as an increasing function 
of the approximation to the conditions necessary to primary rein- 
forcement. Accordingly the joint behavioral outcome of the several 
muscular contraction-intensity phases which occur in any act will 
both yield and receive a functionally graded amount of (secondary) 
reinforcement. For example, in archery practice if the arrow hits 
the edge of the target more success is indicated than if it does not 
strike the target at all, and the smaller the ring it enters, the greater 
is the success, the very center of the target indicating the greatest 
success of all and so generating the greatest reinforcement. Sim- 
ilarly, the more pins a bowler knocks down with his ball, the greater 
will be the reinforcement of his act; the louder the laughter at the 
telling of a joke, the greater will be the reinforcement received by 
the comedian; the closer the approximation of a letter to the form 
of the copy, the greater will be the reinforcement to him who is 
learning to write; the more words typed per minute by the com- 
mercial student, the greater will be her reinforcement; the shorter 
the time required to run one hundred yards, the greater will be 
the reinforcement to the sprinter; the more rapidly the pile of work 
pieces increases, the greater will be the reinforcement to the piece 
worker; and so we could go on endlessly. All of these reinforcements, 
be it noted, are secondary in nature. Reinforcement by gradation 
according to the approach of the reaction to perfection will be 
called correlated reinforcement intensity. 
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ment and inhibition at each reaction evocation as in Case I, we 
have the results shown in Table 24. 

If, now, we cast up the aggregate reinforcements and inhibitions 
of the above single set of equally probable correct and incorrect 
contraction-intensity phases involved (as shown in Table 24), we 
have Table 25. 

TABLE 25. Summary of the theoretical net reinforcement and inhibitory results of
one complete set of equally probable reaction evocation combinations on the eight
possible contraction phases.

A I 1  = 4 - .2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = 4.0 - 1.6 = 2.4
B I 1  = 4 - .2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = 4.0 - 1.6 = 2.4
A I 2  = 4 - .2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = 4.0 - 1.6 = 2.4
B I 2  = 4 - .2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = 4.0 - 1.6 = 2.4
A II 1 = -.2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = -1.6
B II 1 = -.2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = -1.6
A II 2 = -.2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = -1.6
B II 2 = -.2 - .2 - .2 - .2 - .2 - .2 - .2 - .2 = -1.6

Thus we see that the four correct contraction intensities all 
show a net gain of 2.4 points, whereas the incorrect contraction- 
intensity phases all lose 1.6 points. This makes a relative gain for 
the correct one of each pair of competing contraction intensities, 
which indicates that whenever correct and incorrect contraction- 
intensity phases occur anywhere in the combination making up a 
behavior link, by the all-or-none method of reinforcement an
effective selection of the correct from the incorrect contraction- 
intensity phases is quite possible. 
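
A brief enumeration reproduces the totals of Table 25. The sketch assumes four independent choices (muscles A and B, each at successive phases 1 and 2, each at intensity I or II), giving sixteen equally probable combinations of which only the all-I combination is reinforced. The slot labels and the function name are illustrative assumptions, not notation from the text.

```python
from itertools import product


def table_25_summary(reinforcement=4.0, inhibition=0.2):
    """Net change for each of the eight contraction phases over all 16 combinations."""
    slots = ("A {} 1", "B {} 1", "A {} 2", "B {} 2")
    net = {}
    for combo in product(("I", "II"), repeat=len(slots)):
        # All-or-none rule: reinforcement only when every slot is at intensity I.
        gain = reinforcement if all(level == "I" for level in combo) else 0.0
        for slot, level in zip(slots, combo):
            phase = slot.format(level)
            net[phase] = net.get(phase, 0.0) + gain - inhibition
    # Round to one decimal to hide binary floating-point noise.
    return {phase: round(value, 1) for phase, value in net.items()}


if __name__ == "__main__":
    for phase, value in sorted(table_25_summary().items()):
        print(f"{phase}: {value:+.1f}")
```

Each phase takes part in eight of the sixteen combinations, which is why every line of the table carries eight inhibition decrements of .2.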

Generalizing on the preceding considerations, we arrive at our 
next theorem:


THEOREM [number and opening words illegible] ... reinforcement of simple [...]
alternative successive contraction-intensity [...] of adaptive contraction-
intensity phases within a behavior link will gradually occur.

It may be added that this [...] has given an account of an
important aspect of behavior commonly called skill.

Learning Based on Correlated Reinforcement Intensities
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having a different degree of reinforcement. We shall further assume, 
in harmony with the principle of correlated reinforcement, that 
A I and B I contribute to the joint reinforcing state of affairs as a 
whole one point each; that A II and B II contribute 2 points each;
and that A III and B III contribute 3 points each. Using the same
notation as before, we have the following contraction-intensity
possibilities with parallel reinforcements:

A I    1        B I    1

A II   2        B II   2

A III  3        B III  3

This yields in the various possible combinations the amounts of 
joint reaction potential and of extinction effects listed in Table 26. 
An inspection of that set of summated values shows that the out- 
come of the joint contraction-intensity combination yields a graded 
set of net reinforcement results which is nicely correlated with the 
reinforcement differences. By sorting out the three reinforcement 
values of each response intensity, and averaging, we find that: 

A I and B I each averages a reinforcement intensity of 3; 

A II and B II each averages a reinforcement intensity of 4; 

A III and B III each averages a reinforcement intensity of 5. 
This insures, as practice continues, a progressive dominance, i.e., 
a progressive increase in the evocation of the more strongly rein- 
forced contraction-intensity phases A III and B III as contrasted 
with the other two contraction intensities, especially A I and B I, 
even though both contractions of each combination receive equal 
reinforcement at any given evocation. 
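
The three averages just quoted follow directly from the stated contributions. The sketch below assumes that the joint reinforcement of a combination is the simple sum of the two muscles' contributions (1, 2, or 3 points) and that the three partner intensities are equally probable; the names are illustrative.

```python
# Points contributed to the joint reinforcement by each intensity (from the text).
CONTRIBUTION = {"I": 1, "II": 2, "III": 3}


def mean_reinforcement(contribution=CONTRIBUTION):
    """Average joint reinforcement received by each intensity of one muscle."""
    means = {}
    for level, own in contribution.items():
        # Pair this intensity with every equally probable intensity of the partner.
        joint = [own + partner for partner in contribution.values()]
        means[level] = sum(joint) / len(joint)
    return means


if __name__ == "__main__":
    for level, mean in mean_reinforcement().items():
        print(f"A {level} and B {level} each average {mean:g}")
```

Because the inhibition of .2 unit per evocation is the same for every intensity in this case, it does not alter the ordering of the three averages.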

Generalizing from the above considerations, we arrive at our 
next theorem: 

THEOREM 44. Other things equal, the correlated reinforcement of
simple variable acts is favorable to the selection of response intensities 
which are more strongly reinforcing rather than of those which are less 
strongly reinforcing. 

Micro-Molar Analysis of Response-Intensity Generalization

CASE IV. At this point we must notice explicitly the entrance of 
the principle of response generalization (4, pp. 316—319). In a 
completely logical presentation this would, except for expository 
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Generalizing from the preceding considerations, we formulate 
our next theorem: 

THEOREM 43. When a reaction evocation has been reinforced one 
or more times in the presence of some phase of a stimulus continuum, 
subsequent evocations, whether the latter are maximally correct or not, 
will tend to receive graded secondary correlated reinforcement by gen- 
eralization from other phases of this stimulus continuum. 


We are now ready to proceed to the consideration of Case III. 
CASE III. At the end of the preceding section involving the learn-
ing of simple motor coordinations within a behavior link, we con-
sidered only two degrees of contraction intensity at any contraction
phase of a given muscle. We must now recall, in the interest of
realism, that according to the principle of response oscillation
(sOR) there are an infinite number of gradations in the possibility

TABLE 26. The amounts of reinforcement and inhibition resulting from a single
response involving each of the different response-intensity combinations in a
theoretical situation of the all-or-none type of correlated reinforcement.

A I and B I:      1 + 1 = 2 units of reinf. and .2 unit each of Ir
A I and B II:     1 + 2 = 3 units of reinf. and .2 unit each of Ir
A I and B III:    1 + 3 = 4 units of reinf. and .2 unit each of Ir

A II and B I:     2 + 1 = 3 units of reinf. and .2 unit each of Ir
A II and B II:    2 + 2 = 4 units of reinf. and .2 unit each of Ir
A II and B III:   2 + 3 = 5 units of reinf. and .2 unit each of Ir

A III and B I:    3 + 1 = 4 units of reinf. and .2 unit each of Ir
A III and B II:   3 + 2 = 5 units of reinf. and .2 unit each of Ir
A III and B III:  3 + 3 = 6 units of reinf. and .2 unit each of Ir


of contraction [...] at any instant. These [...] the total distribution [...]
evenly over [...] presumably are distributed
approximately according to the normal law of chance (4, p. 319).

[...] A and B, acting
simultaneously, each muscle having [...] intensities of contraction:
I, II, and III; and each of the degrees of joint contraction intensity
TABLE 27. The various combinations of equally probable contraction intensities of
two muscles, in one of which (A) the reinforcement gradient continues to rise as in
Table 26, but in the other of which (B) the gradient ceases to rise with further shift
in reaction intensity. The table shows the increments of reinforcement and extinction
resulting from each combination.

A I and B I:      2 units of reinforcement and .2 unit each of Ir
A I and B II:     3 units of reinforcement and .2 unit each of Ir
A I and B III:    2 units of reinforcement and .2 unit each of Ir

A II and B I:     3 units of reinforcement and .2 unit each of Ir
A II and B II:    4 units of reinforcement and .2 unit each of Ir
A II and B III:   3 units of reinforcement and .2 unit each of Ir

A III and B I:    4 units of reinforcement and .2 unit each of Ir
A III and B II:   5 units of reinforcement and .2 unit each of Ir
A III and B III:  4 units of reinforcement and .2 unit each of Ir

Sorting out the total reinforcements and inhibitions of all the A 
combinations which contain A I, A II, A III, B I, B II, and B III, 
we have the results given in Table 28. 

TABLE 28. A summation of the detailed reinforcements and inhibition increments as
presented in Table 27.

A I   = 2 - .2 + 3 - .2 + 2 - .2 =  7 - .6 =  6.4
A II  = 3 - .2 + 4 - .2 + 3 - .2 = 10 - .6 =  9.4
A III = 4 - .2 + 5 - .2 + 4 - .2 = 13 - .6 = 12.4

B I   = 2 - .2 + 3 - .2 + 4 - .2 =  9 - .6 =  8.4
B II  = 3 - .2 + 4 - .2 + 5 - .2 = 12 - .6 = 11.4
B III = 2 - .2 + 3 - .2 + 4 - .2 =  9 - .6 =  8.4
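
The sums of Table 28 can be re-derived from the entries of Table 27. The sketch below assumes that muscle A contributes 1, 2, and 3 points at intensities I, II, and III, while muscle B contributes 1, 2, and 1; the value for B III is inferred from the Table 27 entries rather than stated outright in the surviving text. All names are illustrative.

```python
A_POINTS = {"I": 1, "II": 2, "III": 3}  # gradient keeps rising for muscle A
B_POINTS = {"I": 1, "II": 2, "III": 1}  # inferred from Table 27: falls off beyond B II
INHIBITION = 0.2                        # Ir per phase per evocation


def table_28_summary():
    """Net reinforcement minus inhibition for each phase over the nine combinations."""
    net = {}
    for a, a_points in A_POINTS.items():
        for b, b_points in B_POINTS.items():
            joint = a_points + b_points  # both phases receive the joint amount
            for phase in (f"A {a}", f"B {b}"):
                net[phase] = net.get(phase, 0.0) + joint - INHIBITION
    # Round to one decimal to hide binary floating-point noise.
    return {phase: round(value, 1) for phase, value in net.items()}


if __name__ == "__main__":
    for phase, value in table_28_summary().items():
        print(f"{phase}: {value}")
```

On these assumptions A III comes out strongest among the A phases and B II strongest among the B phases.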

An examination of these combined incremental and inhibitory
results shows that, quite as one would expect intuitively, contrac-
tion-intensity phase A III has emerged as dominant over both A II
and A I, with which it is in competition, whereas B II has emerged
as dominant over both B I and B III, with which it is in competi-
tion. In the case of the first muscle the increased reinforcement has
led to a further shift in reaction intensity from that at the first
reaction evocation, but in the case of the second muscle the change
(decrease) in the amount of joint reinforcement beyond B II has
led to the stabilization of the contraction intensity which yields
the optimal amount of net reinforcement. It is to be expected
that sooner or later muscle A will reach a contraction intensity such
that its advance oscillatory generalization will decline, as has been
assumed in effect to be the situation in the case of muscle B. At
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difficulties, have been introduced as involved in Case III. The 
principle of response generalization states in effect that every habit 
increment of response intensity oscillates more or less symmetrically 
(4, pp. 304 ff.) about a central response intensity. This means that
as the habit strengths of A III and B III just cited grow strong, they
will begin to generalize and therefore to vary about this new center
of oscillation (xiii). This generalization will create a new group of
evokable reaction-intensity phases, A IV and B IV. But as soon as

[...] Also, as a result the
first group of reinforcements A I and B I will gradually weaken
relatively, and possibly drop out of the competition. [...]
A IV and B IV [...] reinforcement is continued, [...]

A II   2        B II   2
A III  3        B III  3
A IV   4        B IV   4

[...] will gradually be shifted [...]
as long as the reinforcement [...] reinforcement continuum
remains relatively constant.

[...] the mechanism responsible for
the acquisition of precise [...]

CASE V. But what may [...] simultaneous
reinforcement throughout a behavior [...] continue
to respond as before but B III now [...] efficiently with the
A's than before so that the joint [...] reinforcement
as B I? This means that so far as [...] concerned the
contribution of B II is now at its [...] and from there the
contribution slopes off in both directions [...]. In
that case, neglecting previous [...] reinforcement
increments shown in Table 27



THE INDIVIDUAL BEHAVIOR LINK 


that point its progressive shift in reaction intensity may be expected
to become stabilized. Both muscles will then have become as fully 
coordinated as possible. 

Generalizing from the above considerations, we arrive at our 
next theorem: 


THEOREM 46. Other things equal, each muscle in a group involved 
in an act which permits of varying amounts of reinforcement according 
to the net effect of the joint activity, will gradually shift its individual 
contraction intensity in the direction of that intensity which when joined 
with the contraction intensities of the other muscles will yield a maximum
of reinforcement, and will there become stabilized. 
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Taking the various contraction-intensity combinations on this basis, 
we have Table 29. Summarizing these effects, we have Table 30. 

An examination of these values shows that in the case of both 
muscle A and muscle B the series of reaction evocations have 
resulted in a definite net advantage in favor of contraction phase I, 
which involved the least amount of work and which therefore 
generated the least amount of inhibition for both muscles. 

TABLE 29. The reaction and inhibitory potential increments generated by one response
evoked by each combination of the conditions of Case VI where reinforcement is
constant but the amount of work (W) (and so the amount of Ir generated at reaction
evocation) varies (Case VI).

A I and B I:      A I   = 2 - .2 and B I   = 2 - .2
A I and B II:     A I   = 2 - .2 and B II  = 2 - .4
A I and B III:    A I   = 2 - .2 and B III = 2 - .8

A II and B I:     A II  = 2 - .4 and B I   = 2 - .2
A II and B II:    A II  = 2 - .4 and B II  = 2 - .4
A II and B III:   A II  = 2 - .4 and B III = 2 - .8

A III and B I:    A III = 2 - .8 and B I   = 2 - .2
A III and B II:   A III = 2 - .8 and B II  = 2 - .4
A III and B III:  A III = 2 - .8 and B III = 2 - .8

TABLE 30. A summation of the detailed learning and inhibition increments presented
in Table 29.

A I   = 2 - .2 + 2 - .2 + 2 - .2 = 6 -  .6 = 5.4
A II  = 2 - .4 + 2 - .4 + 2 - .4 = 6 - 1.2 = 4.8
A III = 2 - .8 + 2 - .8 + 2 - .8 = 6 - 2.4 = 3.6

B I   = 2 - .2 + 2 - .2 + 2 - .2 = 6 -  .6 = 5.4
B II  = 2 - .4 + 2 - .4 + 2 - .4 = 6 - 1.2 = 4.8
B III = 2 - .8 + 2 - .8 + 2 - .8 = 6 - 2.4 = 3.6
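
The work-gradient case of Tables 29 and 30 reduces to a one-line computation per intensity. The sketch below assumes a constant reinforcement of 2 points per evocation, inhibition of .2, .4, and .8 unit for intensities I, II, and III, and three equally probable partner intensities, so that each phase is evoked three times in a complete set. The names are illustrative, and the same values apply to muscle A and muscle B.

```python
# Inhibition (Ir) generated per evocation, rising with the work of the contraction.
INHIBITION_BY_PHASE = {"I": 0.2, "II": 0.4, "III": 0.8}


def table_30_summary(reinforcement=2.0, evocations=3):
    """Net gain of each intensity over one complete set of combinations."""
    return {
        level: round(evocations * (reinforcement - inhibition), 1)
        for level, inhibition in INHIBITION_BY_PHASE.items()
    }


if __name__ == "__main__":
    summary = table_30_summary()
    for level, value in summary.items():
        print(f"phase {level}: {value}")
    print("strongest phase:", max(summary, key=summary.get))
```

With reinforcement held constant, the ordering of the net gains is fixed entirely by the inhibition column, so the least effortful phase comes out ahead.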

But as contraction phase I becomes dominant and phase III
is partially extinguished, the center of oscillation will shift from
III to II, with the result that oscillation or response-intensity
generalization (4, pp. 316, 319) will spread to the next weaker con-
traction phase (I), which involves a still smaller amount of work.
This progression will obviously go forward until a point is reached
at which the original positive reinforcement begins to diminish
either in amount or in probability of occurrence, or in both. At
that point the migration of contraction-intensity phases will begin
to stabilize itself. Final stabilization will occur at the point at
which the maximum net reaction potential (sER) is attained. The
locus of this point will be determined, of course, jointly by (1) the 
slope of the amount of inhibition as a function of work (W), and 
(2) the falling of the reaction potential as a function of the reduced 
muscular contraction. We do not know enough about the param- 
eters involved to make such an attempt at the present time. 

Generalizing on the basis of the above considerations, we arrive 
at our next theorem: 


THEOREM 47. Other factors being equal, the various contraction-
intensity phases of every muscle involved in the performance of a simple
behavior link will gradually shift until they involve less work, eventu-
ally reaching a minimum where they will stabilize.


At this point we may consider the joint effect of varying amounts
of reinforcement on the one hand (Theorem 45) and of varying
amounts of work in performing an act (Theorem 46) on the other,
each originally treated separately. This joint coordination represents
[illegible] efficiency and the maximum of attain-
of 26 grams, say, food will be delivered and the reaction will be
reinforced. This increment in reaction intensity will generalize on 
the basis of the proprioceptive stimulus intensity and the oscillation 
function. 

As practice continues, some of the response intensities will fall 
below the 20-gram limit and will begin to suffer extinction. This 
will not only attenuate still further the weak response itself, but 
through the stimulus-intensity generalization of inhibition it will 
cause adjacent stronger reaction tendencies to lose strength even 
at intensities above the limit of reinforcement. This in turn will 
remove the competition from the low level, permitting responses 
from the higher levels to be evoked. These responses will, of course, 
be reinforced, which will still further strengthen the tendencies at 
the higher levels and also increase the generalized reaction potentials 
at the levels consistently extinguished. As a result a small number of 
responses below the lower reinforcement level will continue to 
occur. 

Generalizing on the above considerations, we arrive at our next 
two theorems: 

THEOREM 49. Where response intensities are given reinforcements
restricted at the lower limit only, the frequency of the responses below
the limit will gradually diminish, but a few responses below this level
will continue to occur.

THEOREM 50. If the lower limit of restricted reinforcement is raised, 
the whole distribution of reaction intensities will be raised, many 
responses now occurring which have never previously been reinforced. 

Suppose, now, that we impose a second or upper restriction on the
reaction intensities which will be reinforced, such that the new
restriction falls appreciably below the level of response intensities
made under the lower restriction when acting alone. A case in
point would be to impose an above-30-gram restriction on a set
of responses previously set up under a 20-gram lower limit. It is
at once evident that all those responses falling above the 30-gram
limit will tend to be extinguished, and that this inhibition will, by
the principle of stimulus generalization, generalize especially upon
the upper portion of the reaction intensities within the range really
reinforced. This will gradually reduce the frequency of responses
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not only above the upper reinforcement limit but also in the upper 
portion of the intensities really reinforced.

Generalizing from these considerations, we arrive at our next 
theorem: 


THEOREM 51. When an upper restrictive reinforcement limit is 
placed on a reaction intensity distribution set up under a lower restrictive 
limit, and the upper limit falls appreciably below the range already 
occurring, (a) the distribution will be narrowed, (b) its central tendency 
will shift downward, and (c) it will present a larger range below the 
lower limit of reinforcement than it did under the lower limit alone. 


At this point we must recall a principle already employed. This
is to the effect that where we have a work gradient, as here, there
will be a greater amount of extinctive inhibition generating from
each response above the upper reinforcement limit (because of
[...]) than from each one below the lower reinforcement
[...] the generalization of this greater amount of inhibition above
[...] generalization of the
inhibition below the lower response range [...]

[...] considerations, we arrive at our final
theorem [...]


[...] Regarding Micro-Molar [...]

[...] of simple adjustment [...] behavior [...] let us see how
far the preceding theoretical [...]
empirical evidence. [...] it must be confessed that
as yet there has been no direct experimental [...] of the
theory regarding the elimination of faulty [...] contraction
phases of individual muscles. Presumably [...]





muscle on an animal’s leg, and then recording the movement 
intensities of the part of the leg primarily involved in the activation 
of this one muscle. After the wounds incidental to the operation had 
healed and the animal had become habituated to the experimental 
conditions, a recording dynamometer could be attached to the 
moving member and the animal when hungry would be reinforced 
for such intensities of contraction as the experiment would require, 
but not for others. Until some such experiment is performed we can 
only make inferences from analogical studies involving the normal 
joint action of numerous muscles as observed in intact organisms. 



FIGURE 48. Graph showing mean per cent of failure of twelve or more infants of differ-
ent ages to reach a red one-inch cube placed on a plane wooden surface in front of them.
Adapted from Halverson (3, p. 161).

Fortunately, several fairly pertinent investigations of the latter 
type are now available. 

Empirical verification of Theorems 46, 47, and 48 is furnished
by the fact that the initial awkward and angular movements made
while an act is being learned gradually become linear where rein-
forcement conditions permit, and tend to follow smooth curves
where changes of direction are required. This is because sharp
changes in direction or other sudden stops and starts in movement
require work to overcome the momentum in deceleration and the
inertia in subsequent acceleration. An illustration of this at a very
primitive level is reported in a meticulous study by Halverson con-
cerning the acquisition by infants of the power to reach and grasp.
The results from one part of this investigation are summarized as
follows (3, p. 273):
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Three forms of [reaching] approach appear: the backhand 
sweep; the circuitous, which includes, besides the angular and 
scooping sweeps, the less circuitous reaching; and the direct 
(straight) approaches. Infants from 16 weeks to 28 weeks of 
age employ either the backhand approach, or the very circui- 
tous approach in reaching. Infants of 32 and 36 weeks use a 
less circuitous form of approach in reaching for the cube and 
infants of 40 and 52 weeks usually employ the direct approach. 
Similarly, the backhand and circuitous approaches straighten out into
the direct approach. [Italics ours.] 

And again (3, p. 274):


From 16 weeks to 24 weeks, infants often raise the hand, 
thrust it forward circuitously, and lower it in a manner which
approach consists of three individual acts. 
40 weeks no trace of these separate acts is discernible; they are 
incorporated into one fluent reaching movement. [Italics ours.] 
FIGURE 49. Three-dimensional models of the path of the left hand of a man re-learning
to operate a drill press, at four different stages of training, the final stage being at
the right. Unfortunately for the evidential value of this illustration, the learning repre-
sented occurred under specific instructions to eliminate waste motions. Its relevance
here lies in the theoretical expectation that uninstructed practice would tend in time
spontaneously to produce much the same kinds of movement simplification, though not
so quickly and not so markedly as under the Gilbreth type of instruction. Reproduced
from Gilbreth and Gilbreth (2, pp. [illegible]-91).
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reproduced as Figure 49. A study of these models shows a steady 
and marked simplification, shortening, and smoothing of the seg- 
ments making up the action cycle from an early stage of practice, 
at the left, to a late stage, at the right. The success or precision of 
skilled action in industry is reflected in the usual increase in pay- 
ment with length of training. 

A second series of studies of a quite different type was begun by 
Hays and Woodbury who used an apparatus which was essentially 
a Skinner box with recording dynamometer. Their study shows 
(4, p. 305) the distribution of bar-pressure intensities of an albino 
rat. The mechanism was so set that all pressures above a 21-gram
minimum were reinforced by a small cylinder of specially prepared
food, and pressures below 21 grams were not reinforced. After
several hundred trials of this restricted reinforcement had been
given, a distribution of the responses showed (1) that most of the
reaction intensities exceeded the 21-gram minimum, some of them
by as much as 20 grams, and (2) that the distribution was approxi-
mately symmetrical. Thus Theorem 49 finds empirical verification.

We now approach a more complex problem. Hays and Wood- 
bury shifted the critical reaction intensity from 21 to 38 grams. 
This change caused the distribution as a whole to move in the 
direction of greater reaction intensity, nearly half of the reactions 
under the new conditions exceeding the maximum reaction ob- 
tained under the first conditions. This is the main point of the 
empirical illustration: differential reinforcement of certain contrac- 
tion intensities of a variable response causes the distribution to 
shift away from the unreinforced reaction intensities in the direc- 
tion of the reinforced ones (4, p. 305). Thus Theorem 50 also finds 
empirical verification. 

An experiment which considerably extended the Hays-Wood-
bury study was performed by Arnold (7). The apparatus used was
the same as in the former study except that after the animals had 
learned to obtain food by receiving reinforcement only when their 
pressure exceeded 30 grams, it was modified in such a manner 
as to yield food only when pressures were made within an arbitrary 
range falling between 30 grams and 40 grams. The distributions 
of the reaction intensities of three typical animals at the last hundred 
of 800 trials are shown in the upper portion of Figure 50. Here we 
see that most of the reactions fall considerably above the minimum 
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marked by the broken vertical line. The lower portion of this 
figure shows the distribution of pressures on the last hundred of 
300 trials after the upper limit was imposed. It may be seen in this 
graph that following the introduction of the upper limit there is a 
marked reduction in the number of strong-intensity reactions. 
Secondly, there is a shift of the distribution as a whole, somewhat 
in the weak direction. Thirdly, there appears to be a net narrowing 
of the amount of variability. In a word, these three facts furnish 
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in a favorable direction, we have a striking experiment performed 
a few years ago by Skinner (6; 7). He trained an ordinary albino 
rat to lift a rather heavy steel ball in its paws and drop it into a 
tube which projected approximately a centimeter above the floor 
of the apparatus. The falling ball made an electric contact lower 
in the tube, which caused a magnetic food-vending device to 
deliver a pellet of food to the animal, thereby reinforcing the act. 
On the basis of the maxim that “an act must first occur before it 
can be reinforced,” such an achievement in training would be 
impossible because the acts which occurred late in the training 
did not occur at all at its beginning. However, the technique
employed by Skinner, when taken in conjunction with the principles
elaborated in this chapter, makes the feat perfectly intelligible. He
first induced the rat to roll the ball a little in any direction what- 
ever, giving food reinforcement after each response. Later, when- 
ever this act varied in a favorable direction, e.g., when the ball was 
rolled toward the tube, it was reinforced, but it was not reinforced 
when the ball was rolled in any other direction. At the beginning 
the tube was lowered so that it represented only a hole in the floor. 
Thus the rat had only to roll the ball to the hole, and as it fell in 
the act was complete. The last and critical stage was to raise the 
tube ever so slightly above the floor of the apparatus. When the 
slight variations of preceding behavior necessary to overcome this 
obstacle were fixed by trial and error and differential reinforcement, 
the tube was raised slightly again, and some of the small variations
of the motor coordination previously formed were sufficient to over- 
come the new obstacle. As practice was continued the tube was 
progressively raised and the rat’s behavior gradually adapted to it 
until at the end of the training the animal was lifting the ball a full 
centimeter. 

The exceedingly gradual progress of human skills and inventions, 
when viewed in historical perspective, rather suggests that a 
mechanism somewhat similar to that described above may be in- 
volved in addition to the advantage which the possession of lan- 
guage undoubtedly gives to man. In the latter respect men differ 
from rats in their ability under favorable circumstances to advance 
by larger steps in the direction of behavior novelty. The reason 
for the fact that the higher forms of non-speaking organisms possess 
greater power to acquire complex skills and coordinations than
do the lower forms, probably lies mainly in their greater capacity
for differential secondary correlated reinforcement (4, pp. 84 ff.);
and this presumably arises from a greater capacity for differentiat-
ing (discriminating) more precisely the movements which lead
more closely or less closely to states of affairs uniformly associated 
with primary reinforcement. 


Summary 

Ordinary behavior analysis is based on the reaction chain as a 
whole where success at a link is reinforced (secondarily) by progress 
toward a point of primary reinforcement or goal, and where errors 
at once produce a frustrating interruption in progress toward the 
goal. Such coarse divisions of behavior are not available for the
selective process within the individual behavior link where the 
occurrence of an erroneous response does not cause a behavior 
interruption before the end of the link. At that time all reaction 
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the joint action of all. This is the so-called all-or-none type of 
reinforcement or extinction. There seems to be no separate trial-and- 
error learning for the muscular contractions within a behavior link. 

An analysis of a pair of simple examples brings us to the con- 
clusion that the all-or-none type of differential terminal reinforce- 
ment is capable of the effective selection of the more adaptive 
contraction-intensity phases from the less adaptive contraction- 
intensity phases of each muscle at each stage of an act. 

A further analysis of the learning of simple acts or skills leads to 
the view that reinforcement is often not only all-or-none, or pri- 
mary, but is secondary in nature; it is correlated or graded accord- 
ing to the nature of the joint outcome of the act. On the basis of 
the working out of an example we conclude that adaptive variations 
arising in accordance with the principle of behavioral oscillation 
(sOR) can be selected from less adaptive ones quite effectively by
correlated secondary reinforcement. 

But granted that a sequence of parallel contraction-intensity 
phases which will approach a maximum correlated reinforcement 
can be selected, there remains another dimension in the reaction 
picture — that of the economy of energy consumption. By means of a 
simple example it is shown how those oscillatory variations which 
chance to reduce the work factor will, other things equal, gradually 
lead to more rapid and also to less fatiguing performances of 
uncomplicated repetitive acts. This is in conformity with the molar 
law of less work (4, p. 293). 

In a still more minute examination of the acquisition of the 
motor coordination of behavior links and skill, the process of 
response generalization has been analyzed. It was found that the 
oscillation factor superposed upon the generalization on the stimu- 
lus dimension produces the phenomenon of response intensity general- 
ization to which the response is attached (4, p. 316). From this it 
follows that differential reinforcement above a critical intensity of 
response will push the whole distribution of stimulus intensities 
upward; that putting on an upper limit will push the distribution 
of stimulus intensities downward; that a double (upper and lower) 
restrictive reinforcement limit will narrow the range of reaction 
intensity; that in such a case the reactions beyond the lower limit 
will extend farther than those beyond the upper limit; and that 
the mode of the response intensity will fall closer to the lower 
limit than to the upper limit. All of these latter theoretical deduc- 
tions are supported by empirical observations. 

And finally it may be pointed out that the principles of behavior 
oscillation and correlated reinforcement as stated in the preceding 
paragraph have yielded an understanding of how needed novel acts 
never previously performed may come into existence so that their reinforce- 
ment may occur in the conventional manner, a problem that has 
greatly disturbed some theorists. 


8. Behavior in Relation to Objects in Space 


All behavior must necessarily occur in space. To be adaptive, how- 
ever, much behavior, though by no means all, must take place in 
certain relationships to one or more specific objects in space. Be- 
havior in relation to objects and points in space has definite char- 
acteristics. Except in the recent past, students of behavior have 
for the most part not explicitly recognized approach and avoidance 
behavior as a division of psychology requiring special and distinc- 
tive treatment. In the present work we ourselves have so far avoided 
the explicit consideration of this important phase of behavior 
theory. Now, however, we have reached a point in our exposition 
at which we can give it the somewhat detailed consideration which 
its importance and complexity require. 

At first glance it may seem that behavior toward objects in 
space involves no special problems beyond those encountered in 
any other phase of behavior. To an anthropomorphic psychology 
the reaction to objects in space presents no special problems because 
the actual situations present no personal problems to normal 
humans. We are prone, therefore, naively to pass such situations 
by without raising the theoretical question of how nonorientational 
behavior differs from those forms involving the reaction directly 
to objects in space. At the very outset of the present chapter we 
must divest ourselves of this natural but fatal complacency regard- 
ing approach and avoidance phenomena. 

Preliminary Qualitative Theoretical Analysis of Adience and Abience 

Let us suppose that an organism is in a state of need (SD) caused 
by its being subject to a temperature below the optimum, and that 
a short series of random locomotor movements will lead it to a 
region in which the temperature is such as to reduce the SD. 
Through the principles of behavior chaining and compound trial- 
and-error learning (Chapter 6) the drive stimulus reduction will 
result in a reinforcement of the response which preceded it, especi- 
ally the final segments of that response, to the stimuli which were 
acting while the behavior took place. Approach behavior of this 
kind we shall call adience or adient behavior, and the object approached 
will be called the adient object. 

Next let us suppose that an organism is in close proximity to a 
heating unit of high temperature; that as a result of this proximity 
the organism has an SD caused by its being subject to a temperature 
appreciably above the optimum; and that a random set of loco- 
motor movements will lead to a withdrawal from the superheated 
object, which is followed at once by a reduction in the SD. This 
drive reduction will result in a reinforcement of the avoidance 
behavior, whatever its nature, to the stimuli which accompanied it, 
especially the stimuli which accompanied the maximum reduction 
in the drive. Withdrawal behavior of this kind we shall call abience 
or abient behavior, and the object from which withdrawal occurs 
will be called the abient object. 

Up to the present time we have tacitly assumed that organisms 
automatically receive stimuli of various kinds, and that theoretical 
problems are concerned only with adaptive response. At this point 
it must be noted that not all stimulus reception is automatic; that 
some receptor adjustments are almost always necessary to enable 
the organism to receive the stimuli optimally, or even at all. For 
example, in order for an organism such as a rat to learn the size 
of a newly found hole, it must bring its vibrissae into contact with 
the hole’s margin; for an organism to discover the temperature of a 
heating unit, it must approach close enough for its skin to feel the 
heat; to hear faint sounds, the organism must turn its better ear 
toward their origin; to identify an odor by its smell, the organism 
must sniff the air; and to see an object, the organism must open its 
eyes and direct its eyeballs toward the object so that the image will 
fall on corresponding points of the retina. 

Now this receptor adjustment for optimal stimulation requires 
certain muscular activity which must be based initially on the 
automatic stimulus reception. This implies that the receptor ad- 
justment is itself based on a general habit formation which 
precedes ordinary instrumental habit action. In the chapters on 
chaining (6) and behavior-link acquisitions (7) we have seen how 
this type of learning takes place. 

Intimately connected with receptor adjustment is the matter not 
only of stimulus reception, but of perception. The specific question of 
space perception, for example, especially concerns us here. As we 
shall see, this very frequently depends on stimulus intensity. Other 
things equal, the more intense the vibrissae stimulation becomes, 
the shorter will be the distance to the redolent object; the more 
intense a radiant heat becomes, the shorter will be the distance 
to the hot object; the louder a sound becomes, the closer will be 
the sounding object. In the case of an object seen by the eye, the 
larger the image on the retina becomes, the closer will be the object 
and the more the two fixating eyes will converge; i.e., the greater 
the tension on the internal recti becomes, the closer will be the 
object. 

How does the animal acquire a knowledge of these space rela- 
tionships? A great deal of light has been thrown on this subject, at 
least so far as higher organisms are concerned, by Riesen’s classical 
study of chimpanzees which lived in darkness from birth until the 
age of sixteen months (75). With these animals, apparently, space 
perception is learned, and the learning is acquired rather slowly 
through an indefinitely large amount of trial and error in which 
the complex stimuli of visual space are closely associated with 
manual-motor and locomotor space movements. For example, as 
an object in the hand is brought toward the eye its retinal image 
grows larger and the convergence of the optical fixation becomes 
greater; and the same thing occurs as the organism walks toward 
an object, though in this case the optical image of the whole sur- 
rounding landscape grows larger. Here we have a motor sense of 
space being associated directly with the corresponding visual cues. 
Riesen’s study strongly suggests that in higher organisms these 
space cues normally receive an immense amount of reinforced 
practice during the first weeks of life. Lower organisms, however, 
require far less practice. 

The most important characteristic common to adient and abient 
behavior is perhaps the extent to which they generalize, i.e., the 
extent to which what in some sense appears to be a new act may 
occur without specific practice. The pronounced generalization 
characteristics of orientational behavior arise from two major 
factors. The first of these is that behavior involving movement to 
any appreciable distance in space, either directly toward or away 
from objects, is largely locomotor in nature. In this connection it 
should be observed that locomotion is a highly generalized form of be- 
havior, since walking as such to one point in space does not differ from walking 
to any other point in space; an organism that has learned to walk in 
unobstructed space to a point ten feet to the north needs no addi- 
tional skill so far as walking is concerned to walk ten or twenty 
or forty feet to the east, west, or in any other direction. Thus 
locomotion is a prime example of response generalization (xiii). 

We must now take note of the second major factor determining 
orientational behavior. All of these forms of distance reception, and 
especially those concerned with vision, constitute uninterrupted 
stimulus generalization continua which parallel the actual distance 
of objects within various ranges. It follows from this that an object 
whose stimuli have been conditioned to a reaction at one distance will tend 
to evoke the reaction at any distance from which the stimuli may be received. 

In the case of adience, to food for example, the portion of the 
distance-reception continuum which is primarily conditioned to 
the object naturally is that which corresponds to a minimal dis- 
tance, since the organism must make actual contact with food 
before it can eat. Even if the first reinforcement did not involve 
locomotion, sooner or later this will be the case, with the result 
that locomotion must inevitably become reinforced in connection 
[...] size of the visual image concerned. When the 
image is received later, even from a greater distance, this will 
[...] habits of receptor [...] learned, which will serve to 
[...] the direction [...] object, and (2) locomotion 
[...]. The activity will be [...] reinforcement. How- 
ever, [...] continue its locomotion for a greater [...] 
the factor of response generalization (xiii). 

Thus it appears that adient behavior 
must be highly generalized both as to direction and distance. 

Generalizing from the above considerations we arrive at the 
following theorem: 

THEOREM 53. Organisms capable of distance reception and com- 
pound trial-and-error learning will display adient behavior which is 
highly generalized in respect to both direction and distance. 

In the matter of abience the reasoning is much the same, though 
an intriguing problem arises here. Despite the fact that adience 
and abience are exactly opposite in the sense of their behavioral and 
adaptive outcomes, in the case of appreciable distance they involve 
for the most part exactly the same activity — namely, locomotion. 
In a strictly objective theoretical system this presents a question. 
Why does a dynamically injurious situation lead to locomotion 
away from the relevant object or point in space rather than to 
locomotion toward it, or just to locomotion without any objective, 
i.e., mere foot and leg movements leading to no place in particular? 

The answer is believed to lie in the fact that the beginning of 
adience and abience ordinarily consists of an orientation movement, i.e., a 
turning of the body as a whole in such a way that the object will 
be in front of the body in the one case or at the back of the body 
in the other. This orientational maneuver may be acquired by the 
process of chaining or compound trial-and-error learning, since 
orientational turning is a necessary preliminary to the success 
(reinforcement) of the activity as a whole. The point is that the 
responses of both adience and abience are patterned, as well as 
the stimuli in the situation. But once the turning or orientation 
of the body as a whole has occurred, the locomotion may continue 
much the same in the two cases. 

Generalizing on the basis of the above considerations we arrive 
at our next theorem: 

THEOREM 54. Organisms capable of distance reception and com- 
pound trial-and-error learning will display abient behavior which is 
highly generalized in respect to both direction and distance. 

Theorems 53 and 54 are amply confirmed by universal observa- 
tion of both human and lower animal subjects. Moreover, both 
theorems are supported by ingenious experiments by Brown (13, 
pp. 434 ff.). 
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where sEr is the excitatory potential as conditioned, sEr is the 
effective or generalized reaction potential, d is the difference 
between the original conditioned stimulus and the evoking stimulus 
in j.n.d. units, and the exponent, j, is an empirical constant. This, 
taken in conjunction with the foregoing, means that in the case of 
adience the potentiality of the organism to approach the reinforc- 
ing object, if it is visible in open space, will have a characteristic 
gradient which will be approximately a negative growth function 
of the reception continuum between the organism and the object,* 
with the high end of the gradient at the object. 

FIGURE 51. Graphic representation of the theoretical shape of a supposedly typical 
gradient of adient reaction potential (upper curve) and of a supposedly typical gradient 
of abient reaction potential (lower curve). The sign of the adient reaction potential is 
arbitrarily taken as positive. Note the negative growth nature of both functions and the 
steeper slope of the abient function. Plotted from Table 31. 

In the case of abience the potentiality of the organism to with- 
draw from an object will have a characteristic gradient which will 
also be a negative growth function of the reception continuum lead- 
ing away from the object, the high end of the gradient again being 
at the object. A systematic series of illustrative numerical theoretical 
values, calculated by means of equations 54 and 55, are shown in 
Table 31. Graphic representations are presented in Figure 51. Thus 
far the gradients appear to be the same as those related to the dis- 
tance stimulus continuum, though the latter are not necessarily 
identical in the two cases. Moreover, the sign of the reaction poten- 
tial (sER), i.e., the direction of the locomotion, will be opposite in 
the two cases. 

latter will be used here to open the various aspects of the problem to 

For further comments on the computational methodology, see final terminal note. 
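
As an illustration only, the two gradients can be computed numerically. The Python sketch below assumes that the generalization relation takes the form E(d) = E0 × 10^(-j × d), which is what the definitions of d and j given above suggest. The function names, the starting potential of 3.0 and both values of j are hypothetical; they are not the entries of Table 31.

```python
def generalized_potential(e0, j, d):
    """Reaction potential at d j.n.d. units from the object (assumed form)."""
    return e0 * 10 ** (-j * d)

# Hypothetical constants: the abient gradient is given the larger exponent,
# so that it falls off more steeply than the adient gradient.
ADIENT = {"e0": 3.0, "j": 0.01}
ABIENT = {"e0": 3.0, "j": 0.03}

def gradient(params, distances):
    """Potentials at each distance for one set of assumed constants."""
    return [generalized_potential(params["e0"], params["j"], d) for d in distances]

distances = [0, 20, 40, 60, 80, 100]
adient = gradient(ADIENT, distances)
abient = gradient(ABIENT, distances)

# The adient sign is taken as positive, so the abient values are shown negative.
for d, a, b in zip(distances, adient, abient):
    print(f"d={d:3d}  adient={a:+.3f}  abient={-b:+.3f}")
```

Both lists are highest at the object (d = 0) and decline with distance; with these assumed constants the abient values decline faster, as the figure caption describes.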

Generalizing on the preceding considerations we arrive at our 
next two theorems: 


THEOREM 55. Adient behavior will display a relatively weak reac- 
tion potential when the organism is far from the object, which will 
grow progressively stronger as the object is approached, the strength 
of the reaction potential being a negative growth function of the dis- 
tance of the organism from the object. 

THEOREM 56. Abient behavior will display its maximum reaction 
potential close to the object, but this will decrease as a negative growth 
function of the distance of the organism from the object. 

isfc but highly important charactcr- 
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direction eauaU continuum extends in every 
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gradient, as stated in Theorems 55 and 56 . direction of the slope of the two 
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axis. Obviously this field of behavior potentiality, which operates 
on the principle of the inverse exponential or negative growth 
function, is not to be confused with electromagnetic or gravitational 
fields which operate on the principle of inverse squares. 

Generalizing on the above considerations, we arrive at our next 
theorem: 

THEOREM 57. Both adient reaction potential and abient reaction 
potential in free space constitute plane fields of reaction potentialities. 

In continuing this account of some quantitative principles of 
adience and abience we must note that the type of stimulus general- 
ization assumed is that characteristic of a strictly naive organism. 
By naivete in this context we mean the absence of discriminatory 
differential reinforcement. It will be recalled (8, pp. 267 ff.) that 
differential reinforcement produces a progressive diminution in 
stimulus generalization, i.e., a steepening of the net generalization 
gradient. Now in the case of strictly static stimulus objects, e.g., 
electrodes capable of delivering a moderate electric shock, no 
reinforcement whatever of abient reaction will occur at any dis- 
tance from the object beyond actual contact, and the danger of 
shock even from accidental movements is zero when the organism 
as a whole is a relatively short distance away. It follows that as the 
organism is subjected to the sophistication of differential reinforce- 
ment in abient situations with static abient objects, the gradient of 
reaction potential will steepen (8, p. 267), tending ultimately to a 
zero asymptote at a relatively short distance from the object. The 
rate of the occurrence of this steepening sophisticated discriminatory 
process will obviously vary with circumstances, though this does 
not particularly concern us here. It is noteworthy that no such 
differential reinforcement takes place in the case of static objects 
yielding reinforcement to adient behavior in completely open 
space.* 

* In 1944 Miller (13, p. 450) pointed out with admirable sagacity that the steepen- 
ing in the slope of the gradient of reaction potential in the case of abience may also 
occur in special spatially restricted situations in the case of adience: 

"If the individual is consistently rewarded for approaching near goals but not far 
ones, he should learn to discriminate on the basis of cues indicating distance and cease 
attempting to approach far goals. Such learning actually seems to occur in the case of 
adults, who will not attempt to reach through small openings for objects obviously 
more than an arm’s length away. In these situations learning produces an approach 
gradient which falls off very steeply, in an almost step-wise manner at about the limit 

Generalizing on the basis of the above considerations, we arrive 
at our next theorem: 

THEOREM 58. With sophisticated organisms operating in open space, 
the gradient of abient reaction potential to static objects at its point of 
maximum slope will be steeper than that of adient reaction potential 
at its point of maximum slope. 

On the side of empirical verification we are fortunate in having 
in Brown’s experimental work certain critical results bearing on 
adience and abience. For example, he trained hungry albino 
rats to run down a 200-centimeter alley to secure food. During 
of ingenious little harness constructed 
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point of shock. Thus our deduction regarding the general nature 
of the slope of the abient gradient finds experimental verification, 
though Brown’s results throw no light on the nature of its curvature 
because, again, only two points on the gradient were determined. 
Finally, the steep slope of the abient gradient as compared with 
that of the adient gradient yields ample empirical support for 
Theorem 58. 

As another pair of primary quantitative characteristics of adience 
and abience, we shall consider the relationships of approach and 
avoidance to primary motivation and incentive (VIII). The best 
evidence now available indicates that so far as primary motivation 
is concerned reaction potential is a monotonic function of drive 
multiplied by incentive, i.e., stimulus intensity and habit strength. 
This means that if the hunger involved in the adient generalization 
gradient shown in Figures 51 and 56 should be decreased so that 
the drive (D) falls 33 1/3 per cent, each value on the gradient would 
be reduced by one third. A general flattening of the gradient would, 
of course, result, together with a convergence of the gradients 
produced by the respective drives as shown by the broken-line 
curve of Figure 56. 
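
The flattening and convergence described here can be checked with a short numerical sketch. It assumes the same negative growth form for the gradient, E(d) = E0 × 10^(-j × d), and treats a one-third fall in drive as multiplying every value by two thirds; the constants 3.0 and 0.01 and the distances are hypothetical.

```python
def adient_gradient(e0, j, distances):
    """Assumed negative growth gradient: e0 * 10 ** (-j * d)."""
    return [e0 * 10 ** (-j * d) for d in distances]

distances = [0, 25, 50, 75, 100]
strong_drive = adient_gradient(3.0, 0.01, distances)

# Drive lowered by one third: each value on the gradient loses one third.
weak_drive = [e * (1 - 1 / 3) for e in strong_drive]

# Absolute gap between the two gradients at each distance.
gaps = [s - w for s, w in zip(strong_drive, weak_drive)]

for d, s, w, g in zip(distances, strong_drive, weak_drive, gaps):
    print(f"d={d:3d}  strong={s:.3f}  weak={w:.3f}  gap={g:.3f}")
```

Because the reduction is proportional, the gap between the two curves is largest at the object and shrinks with distance, which is the convergence the text refers to.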

Turning next to the matter of incentive, which is usually con- 
sidered an aspect of motivation, we note the influence of increasing 
the amount of the food displayed as the adient object. Clearly, the 
larger the amount of food which is presented in the original reinforce- 
ment situation, the stronger will be the resulting reinforcement (8, 
pp. 131 ff.). By the above formula, the larger the amount of food 
(VII, VIII), the stronger will be the generalized sER at any given 
distance. Thus the incentive, K, rather than the drive, D, is varied 
here. However, if the K is decreased by a third through a diminu- 
tion in the food presented, it is evident that the resulting effect on 
the sER will be the same as if the K were left constant and the D 
were reduced by one third. This means that owing to the multipli- 
cative nature of the K and D relationship to sER, even though the 
details of the computations were different so far as the general 
characteristics of the gradient of generalized reaction potential are 
concerned, the two types of motivation modifications would result 
in exactly parallel outcomes. 
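
A minimal sketch of this equivalence follows. It assumes a simplified multiplicative form, sER = D × K × sHR, generalized over distance by the factor 10^(-j × d); other components of reaction potential are left out, and every numerical value and name is hypothetical.

```python
import math

def generalized_potential(drive, incentive, habit, j, d):
    """Assumed simplified form: D x K x habit, generalized over distance d."""
    return drive * incentive * habit * 10 ** (-j * d)

distances = [0, 25, 50, 75, 100]
base = dict(drive=1.0, incentive=1.0, habit=3.0, j=0.01)

# Case 1: drive reduced by one third, incentive unchanged.
less_drive = [generalized_potential(**{**base, "drive": 2 / 3}, d=d) for d in distances]

# Case 2: incentive reduced by one third, drive unchanged.
less_food = [generalized_potential(**{**base, "incentive": 2 / 3}, d=d) for d in distances]

same = all(math.isclose(a, b) for a, b in zip(less_drive, less_food))
print(same)
```

Under a purely multiplicative relation the two modifications give the same gradient at every distance, which is all that the argument above requires.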

Generalizing from the above considerations, we arrive at our 
next two theorems: 

THEOREM 59. Neither adient nor abient reaction potential will 
change the exponential constant of its gradient under various degrees 
of drive (D), but the height of the gradient at the focal object and 
throughout its course will be greater for strong than for weak motivations. 

THEOREM 60. Adient reaction potential will not change the expo- 
nential constant of its gradient under varying incentives (K), but the 
height of the gradient at the focal object and throughout its course will 
be greater for strong than for weak incentives. 


Excellent empirical evidence of the general soundness of Theorem 
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the oscillation function, the principle of less work (8, p. 293) implies 
that a sophisticated organism will take a straight path to an adient 
object. 

The situation concerning the abient path is much the same. The 
reinforcement, such as it is in the case of abience, tends also to 
favor a straight path. However, once an obstacle, or the chance 
effect of oscillation, has diverted the path from a straight line 
there will be no tendency for the organism to swing back to it, 
since abience paths diverge and every point of the compass satisfies 
the condition that the visual image of the abient object be minimal. 

Generalizing from the above considerations, we arrive at our 
next two theorems: 

THEOREM 61. Adience in free space, within the limits of the oscilla- 
tion function (sOR) will tend to a straight line toward the adient 
object. 

THEOREM 62. Abience in free space will tend to a straight line 
but irregularities from it due to the oscillation function or minor 
obstacles will tend cumulatively to produce deviation from a straight 
line more than in the case of adience. 

General observation indicates that both adient action and abient 
action tend to be linear, especially near the focal object where the 
reaction potential is relatively strong, though no empirical evidence 
has been found bearing on the presumably greater tendency for 
abience to deviate from a straight line. 

The Interaction of Two Field Gradients of Adient Reaction Potential 

Having considered the field gradients of adient and abient reaction 
potential when standing singly, we must now extend our examina- 
tion to various natural complications. These complications involve 
the interaction under various conditions of (1) two adient gradient 
fields; (2) two abient gradient fields; and (3) an adient gradient 
field and an abient gradient field; they also involve the influence 
which the imposition of simple barriers has on these gradient 
fields. In the present section we shall consider the interaction of two 
adient gradient fields. 

Let us suppose that an organism has received adient reinforce- 
ment to an object, that exact duplicates of the object, O1 and O2, 
are placed some distance apart in free space, that the organism is 
placed between the two objects at a point nearer O1 than O2, and 
that the relevant receptors are adequately exposed to both objects 
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Looking up the value of .2673 in Guilford’s probability table B 
(4, p. 530), we find that it corresponds to a probability value (p) 
of approximately .106 + .500, or .606. This means that under the 
assumed conditions O1 would be chosen on 60.6 per cent of the trials, 
and O2 would be chosen on 100 - 60.6, or 39.4 per cent of the trials. 
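
The table look-up can be reproduced with the normal probability integral. The sketch below is an illustration under stated assumptions: the helper `choice_probability` and its default sigma of 1 are hypothetical, and dividing the difference by the square root of 2 times sigma assumes two independently oscillating potentials of equal dispersion. Only the quotient .2673 is taken from the text.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def choice_probability(e_near, e_far, sigma=1.0):
    """Probability that the nearer object is chosen (assumed model)."""
    z = (e_near - e_far) / (math.sqrt(2.0) * sigma)
    return normal_cdf(z)

# Quotient cited in the text, looked up directly.
p_near = normal_cdf(0.2673)
p_far = 1.0 - p_near
print(f"p(O1) = {p_near:.4f}, p(O2) = {p_far:.4f}")
```

The computed value agrees with the tabled .606 to within rounding of the table entry, and equal potentials give a probability of one half, as Theorem 63 states for the midpoint.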

Unfortunately we do not yet know the functional relationship 
of distance in feet, say, to the d values of the stimulus distance 
continuum, so we cannot give a representation of p as a function 
of the various positions as stated in feet that a subject could take 
between two adient objects. This, however, should be made pos- 
sible by means of future empirical investigation. 

Generalizing on the above considerations, we arrive at our next 
two theorems: 

THEOREM 63. Other things equal, in a competing adient-adient 
situation in which the organism is placed midway between two dupli- 
cate focal objects with clear distance reception for each, the organism 
will be as likely to take a path leading to one object as to the other, but 
if placed nearer one object it will be more likely to choose that object. 

THEOREM 64. Other things equal, in situations involving competing 
adient-adient reaction potentials to duplicate adient objects, the greater 
the disparity in distance from the organism to the respective objects, the 
greater will be the difference between the two choice probabilities. 

involved in rather than the usual process of simple subtraction, in the computation 
of probability in the equation below. My own feeling regarding this is definitely uncer- 
tain. Even so, we are retaining the process in order to call attention to the problem. 
As usual in such matters, a suitable experiment would decide the issue. 

(1.504 - 1.194) / (1.414 σ) = (1.504 - 1.194) / (1.414 × 1) = [...] / 1.414 = .2673 
p = .106 + .5000 = .606 
1.000 - .606 = .394 

While many experiments have been performed on distance dis- 
crimination, one by Klebanoff has been found which has a real 
bearing on the validity of Theorems 63 and 64. At the outset of the 
trial he made the adient objects clearly available to the distance 
receptors of the organism. According to Miller (13, p. 444), 

Klebanoff (1939) trained hungry rats to secure food by ap- 
proaching whichever end of an alley was distinguished by a 
light and a buzzer. Then he placed them in an approach- 
approach competition by turning on the lights and buzzers 
at both ends of the alley. He found that, if the animals were 
started some distance away from the center, they always went 
directly to the nearest goal. If started at the center they went 
quickly to one goal or the other with little tendency to vacillate. 

A second phenomenon, closely related to the matter of the prob- 
ability of choice in competing adient-adient situations, is that of 
the reaction latency or choice time. Just as the probability of 
a given reaction dominating a given competitive situation is a 
function of the difference between the two competing reaction 
potentials (5, p. 163), it is here explicitly assumed that reaction 
latency is a decreasing monotonic function of the difference (d') 
between two competing reaction potentials. It follows that, other 
things equal, the farther apart the adient objects arc, the farther 
will be the organism from each of them, and, by equation 54, the 
weaker will be the reaction potential to each and so the smaller the 
d' between the reaction potentials upon which a reaction latency 
can be based. Similarly, for constant distances between the adient 
objects the nearer the organism is to a point midway between them, 
the less the d' and so the greater the stR. From these considerations 
we arrive at our next two theorems: 

THEOREM 65. Other things equal, in an adient-adient competing 
situation involving duplicate objects, the greater the separation of the 
objects, the greater will be the reaction latency. 

THEOREM 66. Other things equal, in an adient-adient competing 
situation involving duplicate objects, the less the disparity in distance 
between the organism and the respective adient objects, the greater will 
be the reaction latency. 

Recalling Theorems 59 and 60 in connection with the competition 
of two adient reaction potentials, we obviously have at once 
a series of additional theorems concerning the probabilities and the 
latencies of reaction occurrences as dependent upon the amounts 
of (a) drive motivation (D) and (b) incentive motivation (K). 





Generalizing on these and related considerations we arrive at 
our next two theorems: 

THEOREM 67. Other things equal, in an adient-adient competitive 
situation involving duplicate objects, the greater the motivation, the 
greater will be the probability of the choice of the nearer adient object, 
and the less the latency (stR). 

THEOREM 68. Other things equal, in an adient-adient competitive 
situation involving duplicate objects, the greater the incentive (K), the 
greater will be the probability of the choice of the nearer object and the 
less the latency (stR). 

In the above adient-adient situations, the competition has been 
homogeneous in the sense that the adient objects have been dupli- 
cates. At this point we pass to the consideration of two adient-adient 
competitive situations which are heterogeneous; i.e., situations 
in which the adient objects are not duplicated. In all of these 
the organism is placed midway between the objects. However, in 
the first case one of the objects has a greater incentive (K') value 
than the other; e.g., it consists of a larger amount of food (VII; 
VIII). A second situation of this general nature is that in which 
the organism has a greater need up to a certain limit of inanition 
(e) of the one or the other object (V B; VIII). It follows from 
Theorems 59 and 60 that the organism will choose that adient 
object which has the greater incentive value and the one for which 
it has the greater need. 

Generalizing from these and related considerations we arrive 
at our next two theorems: 

THEOREM 69. Other things equal, in a heterogeneous adient-adient 
competitive situation with the organism placed midway between the 
adient objects, one of which consists of a greater quantity of the rein- 
forcing substance, the organism will tend to choose the direction of the 
object which has the greater incentive value. 

THEOREM 70. Other things equal, in a heterogeneous adient-adient 
competitive situation with the organism placed midway between the 
adient objects, for one of which the organism has a greater need or 
drive (D) than for the other, the organism will tend to choose the 
direction of the object involving the greater drive. 

As a final relationship in the present adient-adient series, we 
take the case in which one adient object is displayed at a certain 
stimulus continuum distance (d) from an organism and a duplicate 
adient object is displayed at a short distance beyond the first. It is 
evident that at the outset this is no competitive situation, but rather 
a summative one. There are numerous complex theoretical problems 
here related to afferent interaction with which we are not yet 
in position to cope. Assuming that the interaction effects are less 
in the aggregate than the original uncomplicated reaction potential 
of the more remote of the two objects, we may conclude that the 
joint reaction potential will be greater than that for the near object 
alone. It follows from this and the monotonic relationship of stR 
as a function of sER (Postulate XIV) that the reaction latency 
toward both O1 and O2 will be less than that toward either one 
alone. 

Generalizing from these considerations we arrive at our next 
two theorems: 


THEOREM 71. Other things equal, when duplicate adient objects 
are placed on a line with and in the same direction from the organism, 
the adient latency (stR) will be less than that for either object alone. 

THEOREM 72. Other things equal, when an organism is presented 
with duplicate adient objects on a line with and in the same direction 
from the organism, the farther away the more remote object is from the 
organism, the greater will be the reaction latency. 


No empirical evidence bearing on the validity of Theorems 65 to 72 
has been found, though the experimental procedures for securing it 
would be straightforward. 


The Interaction of Two Field Gradients of Abient Potential 

We pass now from the behavior of organisms toward adient objects to 
abient relationships. Perhaps the simplest and at the same time one of 
the most interesting of the latter relationships is found in the 
experimental situation where the organism is enclosed in a long narrow 
space or alley, at each end of which is an object from which the 
organism has received punishment, such as an electric shock. Let us 
assume that the organism has received the same number of shocks of 
equal intensity from each object; the height of the gradient of abient 
reaction potential is therefore at the same level at each end of the 
enclosure. If we take this level of reaction potential at 4.00σ in each 
case, the abient gradients being the same as the one shown in Figure 51, 
we shall have the interaction of the two gradients as they appear in 
Figure 52, where the two abient gradients are assumed to originate at 
points O1 and O2 respectively. 

FIGURE 52. Diagrammatic representation of the interaction of two homogeneous 
abient gradients originating from points O1 and O2 respectively. Note that wherever the 
animal is placed it will move in accordance with the dominant difference in reaction 
potential (broken line) until a point is reached at which this difference is zero. Naturally 
this occurs where the two abient gradients intersect. Since the two abient gradients are 
symmetrical this intersection is at the midpoint of the enclosure, namely, a point 
35 j.n.d.'s from each abient object. The differences were calculated by the equation 
(13). 

Now, wherever the animal is placed on the scale of j.n.d. distances 
from the points of reinforcement, with the exception of the 
midpoint, there will be an imbalance of reaction potential amounting 
to the difference (∸) between the two gradients. The difference 
is represented in Figure 52 by the broken lines. This means that 
if the animal finds itself 10 points from the right-hand extreme of 
the alley, there will be a net reaction potential of 2.525σ ∸ .25σ, or 
2.374σ, to move toward the left. However, as the animal moves 
farther toward the left, the difference grows less and less until the 
two primary gradients cross, at which point the difference necessarily 
becomes zero. Here, then, the animal tends to cease moving 
progressively in either direction. Evidently at this intersection we 
have what has been called a point of stable equilibrium (9, p. 92; 
13, p. 436); i.e., a point at which the interacting gradients tend to 
produce no movement. 
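
The arithmetic of the preceding paragraph may be checked with a short computational sketch. It rests on assumptions that are not stated in this section: that each abient gradient falls off as the maximum times 10 raised to -.02d (the slope implied by the later substitutions in equation 56), that the alley is 70 j.n.d.'s long, and that the difference (∸) is taken as M(C - x)/(M - x) with M = 6.0σ. The last form and the value of M are adopted here only because, applied to the rounded values 2.525 and .25, they reproduce the 2.374σ quoted above. All function names are illustrative.

```python
# Sketch under stated assumptions: slope .02 per j.n.d., alley of 70 j.n.d.,
# maxima of 4.0 sigma, and an ASSUMED ceiling M = 6.0 sigma for the
# difference operator. None of these names come from the text.
J = 0.02
ALLEY = 70.0
PEAK = 4.0
M = 6.0


def abient(d, peak=PEAK, j=J):
    """Abient reaction potential d j.n.d.'s from its source."""
    return peak * 10 ** (-j * d)


def withdraw(c, x, m=M):
    """Assumed difference operator: stronger potential c less weaker x."""
    return m * (c - x) / (m - x)


def net_potential(d_right):
    """Net potential d_right j.n.d.'s from the right-hand object.

    Positive values impel the animal toward the left, negative toward
    the right.
    """
    right = abient(d_right)
    left = abient(ALLEY - d_right)
    if right >= left:
        return withdraw(right, left)
    return -withdraw(left, right)


# The rounded values used in the text give the quoted difference.
print(round(withdraw(2.525, 0.25), 3))

# The net potential shrinks toward the midpoint and vanishes there.
for d in (10, 20, 30, 35):
    print(d, round(net_potential(d), 3))
```

Computed from unrounded gradient values the difference at 10 j.n.d.'s comes out slightly below 2.374σ, which suggests that the figure in the text was obtained from values read off the graph.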

Generalizing from the preceding considerations we arrive at our 
next two theorems: 


THEOREM 73. When an organism is placed near one end of a 
restraining alley at each end of which there are duplicate abient objects, 
the organism will move to a point of equal reaction potential midway 
between the objects, where it will tend to cease systematic progressive 
movements toward either end. 

THEOREM 74. When an organism is placed in a restraining alley 
at each end of which there are duplicate abient objects, the closer the 
organism is to one end when released, the greater will be the probability 
of action leading to the midpoint of the alley, and the shorter 
the latency of the act in question. 


This striking example of behavioral equilibrium is closely analogous 
to cases of equilibrium in the physical sciences, perhaps the most 
familiar being that in which a weight is displaced from a point of rest 
and tends to return to it. The two kinds of equilibrium, however, depend 
upon different laws, and therefore neither throws any light on the 
characteristics of the other. The fact that the two must be derived from 
distinctly different principles serves as a reminder that the science 
of molar behavior is not the science of molar physics. 

When the two gradients of reaction potential 
are said to be equal at the midpoint of the restraining alley, it 
must be understood that this statement, even for an individual 
organism, necessarily holds only for the average. Actually the two 
reaction potentials still continue to compete. And since each is 
subject to its own individual and uncorrelated behavioral oscillation 
tendencies (5, p. 308) it is inevitable that the organism will not 
become immobile when it reaches the center of the alley. The 
momentum, both physical and behavioral, of the preceding movement 
will presumably carry the organism beyond the midpoint at 
first; this will tend to be corrected, which will cause further movements 
of a pendular nature. But quite apart from the pendular 
movements there will inevitably be irregular oscillating movements 
because of the principle of behavioral oscillation (sOR) as 
such. 

Generalizing from the preceding considerations we arrive at 
our next theorem. 

THEOREM 75. Other things equal, when an organism is placed in a 
restraining alley at each end of which are duplicate abient objects, 
the organism even when at the midpoint will continue to oscillate 
short distances forward and backward from this point as a center. 

We are fortunate in having critical experimental evidence bearing 
on the validity of the above theorem. Miller (13, p. 445) reports 
an experiment by Klebanoff: 

He trained another group of animals to escape an electric shock 
by running away from whichever end of the alley was dis- 
tinguished by a light and buzzer, and then placed them in an 
avoidance-avoidance conflict by turning on the lights and 
buzzers at both ends of the alley. When released a considerable 
distance away from the center, all of the animals started by 
avoiding the nearest light. After running in one direction these 
animals stopped and turned back, remaining in conflict be- 
tween the two lights. When released at the center, they started 
more slowly than the approach-approach animals, vacillated 
much more, and remained nearer the starting point. 

Thus Theorems 73, 74, and 75 find empirical substantiation. 

A reexamination of Figure 52 will show that the gradient which 
finally determines the abient movements occurring in the restricted 
abient-abient situation is that represented by the broken line of 
gradient differences (∸). It is also evident that deviations in reaction 
potential resulting from the oscillation factor must operate 
against this gradient, and that the steeper this difference gradient 
is, the more restricted the oscillatory movements must be. Now, it 
is easy to show that, other things equal, the nearer the abient 
objects are to each other, the steeper will be the two difference 
gradients. 
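
The claim that the difference gradients steepen as the abient objects are brought together can be illustrated by a small sketch. It assumes the same gradient form as elsewhere in this section (maximum of 4.0σ, slope .02 per j.n.d.) and, for simplicity, takes the difference by plain subtraction rather than by the (∸) operation; the separations tried are arbitrary illustrative values, and the function names are not from the text.

```python
# Sketch only: plain subtraction stands in for the difference operation,
# and the separations are illustrative.
J = 0.02      # assumed gradient slope per j.n.d.
PEAK = 4.0    # potential at each abient object, in sigma


def abient(d, peak=PEAK, j=J):
    """Abient reaction potential d j.n.d.'s from its source."""
    return peak * 10 ** (-j * d)


def difference(d, separation):
    """Difference between the two gradients d j.n.d.'s from one object."""
    return abient(d) - abient(separation - d)


def slope_at_midpoint(separation, step=1.0):
    """Change in the difference for one j.n.d. of movement off the midpoint."""
    mid = separation / 2
    return (difference(mid - step, separation) - difference(mid, separation)) / step


# The closer the objects, the larger the restoring difference per j.n.d.
for separation in (70, 50, 30):
    print(separation, round(slope_at_midpoint(separation), 4))
```

On these assumptions the restoring difference per j.n.d. of displacement grows as the separation shrinks, which is the relation on which the next two theorems depend.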

From these considerations flow our next two theorems: 


THEOREM 76. Other things equal, an organism placed in a restraining 
alley with duplicate abient objects at either end will make, on the 
average, shorter excursions from the middle toward the respective ends 
the closer the abient objects are to each other. 

THEOREM 77. Other things equal, an organism placed in a restraining 
alley with duplicate abient objects at either end will make, on the 
average, shorter excursions from the middle toward the two ends as 
the reaction potential at the abient objects increases, whether this is 
caused by increased primary motivation (D) or increased incentive (K'). 


No empirical evidence bearing on the validity of Theorems 76 
and 77 has been found, though the methodology of setting up such 
experiments is simple and obvious. The results of an experimental 
test of Theorem 77 would be of special interest because its valida- 
tion depends in a critical manner upon the change in the intensity 
of the reaction potential as related to the parallel change in the 
oscillatory movements. This is a complex and uncertain matter 
because of our lack of knowledge concerning the empirical constants 
involved. An experimental investigation of this problem is 
likely to yield rich returns for the effort required. 

The cases of interacting gradients of abient reaction potential 
employed in the present analysis have so far been homogeneous. We 
pass now to the consideration of a few cases involving heterogeneous 
abient reaction potentials. Let us assume, accordingly, that at one 
end of the alley the abient object (O1) has a maximum reaction 
evocation potential of 4.000σ and that at the other end the abient 
object (O2) has a maximum reaction evocation potential of 2.000σ. 
The respective abient gradients are shown in Figure 53, together 
with the difference (∸) gradients in broken lines analogous 
to those in Figure 52. An examination of this figure shows 
that the point of the intersection of the two abient gradients, i.e., 
the point of zero difference in sER, lies distinctly nearer the weaker 
of the two abient objects. Moreover, the slope of the difference (∸) 
gradient (broken line) has approximately the same steepness 
toward the weaker abient object as toward the stronger. 

FIGURE 53. Diagrammatic representation of the interaction of two heterogeneous 
abient gradients originating from points O1 and O2, respectively. Note the asymmetry in 
the two abient gradients, but the relative symmetry, so far as they go, of the resulting 
difference gradients (∸) in reaction potential (broken lines). (Horizontal axes: number 
of j.n.d.'s from point of reinforcement, O1 and O2.) 

It will be recalled that in the case of homogeneous abient gradients 
just considered, we found it easy to arrive at the determination 
of the point of zero reaction potential by construction methods. In 
the case of two abient gradients with different maxima it is still 
possible, as we have just seen in Figure 53, to arrive at fair 
approximations by means of graphic methods. For a precise determination 
of the point, however, as well as for a general statement of the law, 
we require an equation. This is not difficult to secure. Since at the 
point of zero difference in reaction potential both the opposing 
generalized reaction potentials (sER) are equal, we have, from 
equation 55, 

$$ {}_{S}E_{R} \times 10^{-jd} = {}_{S}E'_{R} \times 10^{-j(A-d)} $$

where sER is the maximum reaction potential of the stronger 
abient object, sER' is the maximum reaction potential of the weaker 
object, A is the distance between the objects in j.n.d.'s, and d is 
the distance between the organism and the stronger object. It 
follows that, 

$$
\begin{aligned}
10^{\,jd - j(A-d)} &= \frac{{}_{S}E_{R}}{{}_{S}E'_{R}} \\
jd - j(A-d) &= \log \frac{{}_{S}E_{R}}{{}_{S}E'_{R}} \\
2d - A &= \frac{\log \dfrac{{}_{S}E_{R}}{{}_{S}E'_{R}}}{j} \\
\therefore\; d &= \frac{\dfrac{\log \dfrac{{}_{S}E_{R}}{{}_{S}E'_{R}}}{j} + A}{2} \qquad (56)
\end{aligned}
$$

This equation may be illustrated in a preliminary way by 
applying it to the homogeneous situation analyzed above, in 
which sER = sER'. This reduces log (sER/sER') to zero so that 
equation 56 becomes, 

$$ d = \frac{A}{2}. $$

In the heterogeneous situation just considered, sER = 4.000σ and 
sER' = 2.000σ. Substituting, we have, 

$$
\begin{aligned}
d &= \frac{\dfrac{\log \dfrac{4.000\sigma}{2.000\sigma}}{.02} + 70}{2}
   = \frac{\dfrac{\log 2}{.02} + 70}{2}
   = \frac{\dfrac{.301}{.02} + 70}{2}
   = \frac{15.05 + 70}{2} \\
\therefore\; d &= 42.53 \text{ j.n.d.'s,}
\end{aligned}
$$

which agrees as well as could be expected with the graphic solution 
given in Figure 53. 
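
Equation 56 lends itself to a direct numerical check. The sketch below assumes, as the substitutions above imply, a slope of j = .02 and a separation of A = 70 j.n.d.'s; the function names are illustrative and are not taken from the text.

```python
import math


def zero_point(strong, weak, separation, j=0.02):
    """Equation 56: distance in j.n.d.'s from the stronger abient object
    to the point of zero difference in reaction potential."""
    return (math.log10(strong / weak) / j + separation) / 2


def abient(peak, d, j=0.02):
    """Abient reaction potential d j.n.d.'s from an object of maximum `peak`."""
    return peak * 10 ** (-j * d)


# Heterogeneous case of the text: maxima of 4.000 and 2.000 sigma, A = 70.
d = zero_point(4.0, 2.0, 70)
print(round(d, 2))

# At that point the two opposing potentials should coincide.
print(round(abient(4.0, d), 4), round(abient(2.0, 70 - d), 4))

# Homogeneous case: the logarithm vanishes and d reduces to A / 2.
print(zero_point(4.0, 4.0, 70))
```

The second line of output serves as a check on the derivation itself: the stronger gradient at d and the weaker gradient at A - d should print the same value.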

Generalizing on the preceding considerations we arrive at the 
following theorem: 


THEOREM 78. Other things equal, an organism placed in a restraining 
alley with heterogeneous abient objects at the two ends will approach 
a point of equal reaction potential which will fall farther from 
the stronger abient object, at a j.n.d. distance from it represented by the 
equation, 

$$ d = \frac{\dfrac{\log \dfrac{{}_{S}E_{R}}{{}_{S}E'_{R}}}{j} + A}{2} $$

The principle of the oscillation of reaction potential in this con- 
text brings us to our next theorem: 

THEOREM 79. Other things equal, when an organism is placed in a 
restraining alley with heterogeneous (abient) objects at the two ends, 
once its pendular reactions have become relatively stabilized at the 
point of zero reaction potential difference, the distribution of oscillatory 
reactions about this point will, within the limits of sampling errors and 
despite the asymmetry of the basic reaction potential gradients, be 
symmetrical to a close approximation. 

Recalling in connection with Theorem 78 our conclusions formulated 
as Theorems 57 and 60, let us suppose that in the 
situation just considered either the motivation or the (negative) 
incentive has been increased in the case of the dominant object 
from the 4.000σ assumed above to 4.500σ. Substituting appropriately 
in equation 56, we have, 

$$
d = \frac{\dfrac{\log \dfrac{4.500}{2.000}}{.02} + 70}{2}
  = \frac{\dfrac{\log 2.25}{.02} + 70}{2}
  = \frac{\dfrac{.35218}{.02} + 70}{2}
  = \frac{17.61 + 70}{2}
$$

d = 43.81 j.n.d.'s. 

But 43.81 > 42.53. 

Generalizing from these considerations we arrive at our next 
theorem: 


THEOREM 80. Other things equal, with an increase in the primary 
motivation (D) or (negative) incentive (K') of one of two otherwise 
equal abient reaction potentials interacting in an organism placed 
within a restraining alley, the point of zero reaction potential difference 
will move to a point farther from the object which has the increased 
motivation or incentive. 


In our consideration of the interaction of adient fields of reaction 
potential we assumed a completely open space, whereas in the 
consideration of the interaction of abient fields of reaction potential 
we have assumed that the organism was restrained within a narrow 
alley. We must now consider the behavior potentialities in the 
interaction of abient fields in completely open space; this will show 
why in some sense it was necessary to assume the restraining alley in 
the formulation of Theorems 73 ff. Let us assume that the two objects 
of an abient-abient interaction are as represented in Figure 54 and 
that they are situated 30.1 j.n.d. units apart in free space. We shall 
further assume that where d = 0 each object has a reaction potential 
of 2.000σ. From appropriate computations by means of equation 55 it 
appears that at the midpoint between the two abient objects the actual 
reaction potential in each direction is 1.000σ. This means that every 
point on circles A and A' has a reaction potential or vector of 1.000σ 
in the direction away from its respective point of origin, which serves 
to emphasize the fact once again that we are here dealing with reaction 
potential fields or two-dimensional space rather than with mere linear 
reaction potential gradients. 

FIGURE 54. Diagrammatic representation of the interacting fields of theoretical reaction 
potential arising from the supposititious abient objects O1 and O2 in free space. The 
circles are drawn to represent loci of equal reaction potentials as follows: A = A' = 1.0σ; 
B = B' = .9σ; C = C' = .8σ; D = D' = .7σ; and E = E' = .6σ. 
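
The radii of the circles of equal reaction potential follow directly from the gradient. The sketch below assumes that equation 55 has the form of a maximum multiplied by 10 raised to -jd with j = .02, which is the form and slope implied by the computations of this section; the function name is illustrative.

```python
import math


def radius(level, peak=2.0, j=0.02):
    """Distance in j.n.d.'s from an abient object at which its reaction
    potential has fallen from `peak` to `level` (assumed form of equation 55)."""
    return math.log10(peak / level) / j


# Circles A to E of Figure 54 carry 1.0, .9, .8, .7 and .6 sigma.
for name, level in zip("ABCDE", (1.0, 0.9, 0.8, 0.7, 0.6)):
    print(name, level, round(radius(level), 2))

# Two circles of 1.0 sigma just touch when the objects are this far apart.
print(round(2 * radius(1.0), 1))
```

On these assumptions the circle of 1.0σ has a radius of about 15.05 j.n.d.'s, so that the two such circles meet at the midpoint of objects 30.1 j.n.d.'s apart, in agreement with the text.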

Now since, according to Theorem 73, in the present situation 
the two opposing reaction potentials are equal, no consistent reaction 
tendency will occur toward either abient object. Because 
these opposed reaction tendencies are in an exact line they will 
completely neutralize each other so far as that factor alone is 
concerned. It follows that from this source there will be no lateral 
movement. However, the operation of the principle of behavioral 
oscillation may be expected to initiate from time to time small 
movements in all directions. Movements toward the two abient 
objects will meet with increasing opposition, but those at right 
angles to the line connecting the two objects will have no 
opposition.¹ Assuming as a first approximation that behavioral vectors 
in quite naive subjects operate roughly as physical vectors, even a 
small movement to one side of the line will unbalance the otherwise 
completely opposed reaction tendencies arising from O1 and O2, 
which will give rise to a combination vector away from the line 
O1O2 (Figure 54). This lateral or summational vector must grow 
larger as the angles aO1P and aO2P grow larger, say to aO1R and 
aO2R, depending on which side of the line O1O2 the first chance 
lateral movement occurs. 


Finally we should observe that the path of lateral movement must 
tend to fall at points where the two opposed zones or fields of reaction 
potential are equal. Thus the organism will pass successively through 
those points where the opposed fields have a reaction potential of 
.9σ (P), thence through the points where they have sER's of .8σ (Q), 
thence through the points where they have sER's of .7σ (R), and 
so on. According to equation 19, the lines connecting P with O1 and 
O2 must at any given instant be equal, and the same must be true 
in the case of R. By ordinary geometry, PO1O2 is an isosceles triangle, 
and therefore the line Pa must cut line O1O2 at right angles. 

The same, of course, applies in the case of triangles QO1O2 and 
RO1O2. Since these points are the intersections of the circles of 
equi-reaction potential, the path which the organism takes in flight 
from the line O1O2, when placed midway between the abient objects, 
will be at right angles to line O1O2. At the line O1O2 the two reaction 
potentials are directly opposed, and their vector summation there must 
be zero regardless of how great the reaction potentials involved may 
be. On the other hand, the longer the lines O1R and O2R become, the 
weaker will be both reaction potentials involved, so that in the 
supposed situation the vector summation must approach zero as a limit 
with continued flight from a. And since the course started with a zero 
reaction potential it follows that the reaction potential must 
gradually rise to a maximum, after which it will gradually fall until 
it is less than the inhibition yielded by the locomotor activity 
involved; the organism will then cease responding to the abient 
objects in question. 

¹ This means there will be no opposition except for the inhibition arising from 
the amount of work of performing the locomotor movements necessarily involved. 

Generalizing from these considerations we arrive at the following 
two theorems: 

THEOREM 81. Other things equal, a naive organism placed midway 
between two duplicate abient objects in free space will tend to move 
in a direction at right angles to the line connecting the two objects. 

THEOREM 82. Other things equal, a naive organism placed midway 
between two duplicate abient objects in free space will have at the outset 
a zero mean lateral reaction potential vector at the line connecting the 
two abient objects. This potential will increase progressively as the 
angle from the organism to the abient objects increases, until this is 
over-balanced by the diminishing strength of abient potential with the 
increasing distance of the organism from the abient objects, after which 
it will gradually decrease and the organism will cease locomotion so far 
as these objects are concerned. 

Although no adequate evidence for the detailed validation of 
Theorem 81 is available, general observation tends roughly to 
confirm it. Moreover, Miller (13, p. 445) reports that Klebanoff's 
rats when in a situation substantially like the one here under 
consideration, “showed a definite tendency to try to escape to the side 
and up out of the alley.” No empirical evidence whatever has been 
found bearing on Theorem 82, though the methods used by Miller 
and his associates would presumably, with a little adaptation, serve 
to secure it. Such evidence, particularly if based on data from 
extremely naive animals, might easily lead to a determination of 
the relationship between the mode of combination of behavioral 
vectors and that characteristic of physical vectors. It is tempting 
to assume the physical vector analogy in this situation, but such an 
assumption is extremely risky unless supported by convincing 
empirical evidence. However striking the analogy, it must never be 
forgotten that molar behavior theory is not molar physics. 

We may add here that the principle stated in Theorem 81 has 
become quite well known through the work of Lewin (10), who 
seems to have been the first to put it forward. Lewin, however, 
apparently was relatively uninterested in the strictly spatial problems 
under analysis here; he gave more serious consideration to 
analogies of a non-spatial nature, such as the tendency of children 
to avoid where possible both an unpleasant task and parental 
disciplinary action, the latter being a normal alternative to the 
nonperformance of the task. Evidently the principles derived above from 
strictly spatial considerations will apply with certainty to the purely 
analogical situation only by chance. 

As a final case in the interaction of two abient field gradients, we 
take one in which the abient objects are again in completely open 
space, but are heterogeneous in nature, instead of homogeneous 
as in the situation just considered. We shall now assume that O1 
has an abient reaction potential at d = 0 of 2.000σ, whereas O2 
has an abient reaction potential at d = 0 of 4.000σ. Appropriate 
computations show that a reaction potential of 1.000σ surrounds 
O1 at a distance of 15.05 j.n.d.'s, whereas an equal and opposing 
reaction potential surrounds O2 at a distance of 30.1 j.n.d.'s. 
Accordingly a figure analogous to Figure 54 could be constructed 
on this basis, additional circles being drawn with each abient 
object as the center, which would show the j.n.d. distance between 
the organism and the object where reaction potentials of 1.0σ, 
.9σ, .8σ, .7σ, and .6σ respectively would fall. 

Now here, exactly as in the case of the homogeneous abient 
objects, the naive organism will at the outset have no particular 
tendency to go in either direction from the line O1O2, but due to the 
action of the oscillation factor small deviations from the line will 
occur. Once the organism is to one side or the other, the lateral 
vector will increase progressively, quite as in the homogeneous 
abient-abient open-space situation. In this case, however, the path 
will be such that both reaction potentials are equal and at a minimum. 
This means that the path will pass through the intersections of the 
circles of equal reaction potential. When this construction is carried 
out it may readily be seen that the path begins nearer O1 and curves 
slightly, also toward O1. 
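
The construction just described may be carried out numerically. The sketch below is illustrative only: it assumes the gradient slope of .02 per j.n.d. used elsewhere in this section, and it assumes that the objects are placed so that the two circles of 1.000σ just touch (about 45.15 j.n.d.'s apart), a separation the text does not state. O1, the weaker object, is placed at x = 0 and O2 at the assumed separation; all names are hypothetical.

```python
import math

J = 0.02  # assumed gradient slope per j.n.d.


def radius(level, peak, j=J):
    """Distance at which an object of maximum `peak` has potential `level`."""
    return math.log10(peak / level) / j


# ASSUMED separation: the 1.0-sigma circles (15.05 and 30.1 j.n.d.) just touch.
SEPARATION = radius(1.0, 2.0) + radius(1.0, 4.0)


def equal_potential_point(level, weak=2.0, strong=4.0, separation=SEPARATION):
    """Intersection (x, y) of the two circles carrying the same potential.

    O1 (weaker) lies at x = 0 and O2 (stronger) at x = separation; only the
    intersection on the upper side of the line O1O2 is returned.
    """
    r1 = radius(level, weak)
    r2 = radius(level, strong)
    x = (r1 ** 2 - r2 ** 2 + separation ** 2) / (2 * separation)
    y = math.sqrt(max(r1 ** 2 - x ** 2, 0.0))
    return x, y


# Successive points of the path for potentials of 1.0 down to .6 sigma.
for level in (1.0, 0.9, 0.8, 0.7, 0.6):
    x, y = equal_potential_point(level)
    print(level, round(x, 2), round(y, 2))
```

On these assumptions the x coordinate diminishes as the organism moves away from the line, i.e., the path of equal opposed potentials bends toward the side of the weaker object, O1.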
Generalizing on the basis of the preceding considerations, we 
arrive at our next theorem: 

THEOREM 83. Other things equal, in a heterogeneous abient-abient 
reaction situation in open space the organism will take a path to one 
side of the line connecting the abient objects, and the path will curve 
in the direction of the weaker object. 

No empirical evidence has been found bearing on the validity 
of Theorem 83, though it would be a relatively easy matter to set up 
an experiment for this purpose. 

The Interaction of an Adient and an Abient Field Gradient 

In the preceding two sections we have considered the interaction 
of two reaction potential fields of the same kind, either adient 
fields or abient fields. Now we must consider the interaction of two 
different kinds of reaction potential — that of an adient potential 
field with an abient potential field. There are two obvious situations 
where this is found. 

Let us take as our first case of adient-abient interaction a situation 
in which O1 is an adient object with a reaction potential (at 
d = 0) of 3.000σ, and O2 is an abient object with a reaction potential 
(at d = 0) of 4.000σ. In addition we assume that the organism 
is placed as close as possible to O2 on the line connecting the two 
objects. It is evident that under the assumed conditions the direction 
of the two reaction potentials, despite their different nature, will 
be the same, i.e., both will impel the organism in the direction of 
O1. The magnitudes of the two reaction potential gradients are 
given in Table 31 and are shown in Figure 55 by the two continuous 
curves. Assuming that the summation is according to the summation 
principle (v, 11), and ignoring probable but unknown 
afferent interaction effects, as well as those of inertia, momentum, 
and so on, the combination of the two sets of reaction potentials 
(which operate in the same direction) yields the results represented 
by the broken line in Figure 55. 
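
The combination just described may be sketched numerically, though only under several assumptions which this section does not supply (Table 31 is not reproduced here): that the summation of two potentials a and b is a + b - ab/M with M = 6.0σ, that the objects are 70 j.n.d.'s apart, that the abient gradient has a slope of .02, and that the adient gradient has a slope of .01. These values are adopted merely because they yield the end values of 4.2 and 3.08 quoted in the next paragraph; all names are hypothetical.

```python
# Sketch under ASSUMED parameters; none are stated in this section.
M = 6.0            # assumed ceiling of reaction potential, in sigma
SEPARATION = 70.0  # assumed distance from O2 to O1, in j.n.d.'s


def adient(d_from_o1, peak=3.0, j=0.01):
    """Adient potential toward O1; the slope .01 is an assumption."""
    return peak * 10 ** (-j * d_from_o1)


def abient(d_from_o2, peak=4.0, j=0.02):
    """Abient potential away from O2."""
    return peak * 10 ** (-j * d_from_o2)


def summate(a, b, m=M):
    """Assumed summation rule for two potentials acting in the same direction."""
    return a + b - a * b / m


def combined(d_from_o2):
    """Summated potential toward O1 at d_from_o2 j.n.d.'s from O2."""
    return summate(adient(SEPARATION - d_from_o2), abient(d_from_o2))


# Tabulate the broken-line values at intervals along the path from O2 to O1.
for d in range(0, 71, 10):
    print(d, round(combined(d), 2))
```

On these assumptions the summated potential is highest at O2, falls to a low point in the middle region of the path, and rises again toward O1.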

An examination of this broken line reveals a characteristic and 
striking situation. At the beginning (O2) the reaction potential 
stands at a maximum of 4.2; it decreases to a minimum near the 
midpoint of the line from O2 to O1, after which it increases 
to a secondary maximum of 3.08 at O1. Unfortunately it is impossible 
to translate these summated sER values into speeds of locomotion 
because of the complication due to momentum and other 
[...] objects are apart in j.n.d.'s, the smaller will be the minimal 
combined reaction potential and the more will this differ from the 
reaction potentials on the respective ends. 

Unfortunately we have here also no empirical evidence against 
which to check the above theorems, though as usual in this field 
the methods followed by Miller should render their validation 
relatively easy. Perhaps the most obvious method would be to place 
the organism at various points along the line joining O1 to O2 
and measure the reaction latency of its locomotion, since reaction 
latency has a special inverse monotonic relationship to reaction 
potential (XIV). 

Perhaps because of its somewhat dramatic issue, our second case 
of adient-abient interaction is relatively well known, having 
previously been discussed by Lewin (10, p. 92), Miller (13, p. 436), 
and the present writer (7, p. 288). It concerns a situation in which 
the adient and the abient objects, instead of being separate, occupy 
practically the same point in free space. In this way the two 
gradients, instead of summating as in the last case, oppose each 
other. In order to facilitate our exposition we shall use the two 
gradients presented in Table 31 and employed in other situations. In 
Figure 56, where these two gradients are represented, the abient 
gradient is placed below because its direction is opposite to that of 
the adient 
gradient. Since their two directions are opposite, the gradients 
combine by the withdrawal principle (— , vii). 

The resulting differences are represented by the broken line 
which appears between the other two lines. A second glance at 
Figure 56 will show that this difference line begins at the left with 
large negative (or abient) values, crosses the zero line at 12.48 
j.n.d.'s, and then passes into a permanent phase of positive or 
adient values. It is evident that here, i.e., at the point where the 
difference value becomes zero, we have what is called a stable 
behavioral equilibrium; this means that except for the operation 
of the oscillation function, the organism will move neither toward 
the double or ambivalent goal object nor away from it. 
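
The course of this difference line can be sketched numerically. The 
following is only an illustration under stated assumptions, not part 
of the original exposition: it assumes that both gradients are 
exponential functions of the distance d, with maxima of 3.0σ (adient) 
and 4.0σ (abient) and exponents of .01 and .02, these being the values 
used in the substitutions of this section. All function names are 
hypothetical. 

```python
# Sketch of the adient-abient difference line.
# ASSUMPTION: each gradient has the form maximum * 10 ** (-j * d),
# with the maxima and exponents used in this section's substitutions.

ADIENT_MAX, ADIENT_J = 3.0, 0.01  # adient maximum (sigma) and exponent j
ABIENT_MAX, ABIENT_J = 4.0, 0.02  # abient maximum (sigma) and exponent j'


def gradient(maximum: float, j: float, d: float) -> float:
    """Reaction potential at a distance of d j.n.d.'s from the object."""
    return maximum * 10 ** (-j * d)


def difference(d: float) -> float:
    """Adient minus abient potential; negative values mean net abience."""
    return gradient(ADIENT_MAX, ADIENT_J, d) - gradient(ABIENT_MAX, ABIENT_J, d)


def zero_crossing(lo: float = 0.0, hi: float = 50.0, tol: float = 1e-6) -> float:
    """Bisection search for the distance at which the difference is zero.

    Relies on the difference being negative at lo and positive at hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if difference(mid) < 0:
            lo = mid  # still net abient: the crossing lies farther out
        else:
            hi = mid  # already net adient: the crossing lies nearer
    return (lo + hi) / 2


if __name__ == "__main__":
    for d in (0, 5, 10, 15, 20, 30):
        print(f"d = {d:>2}  difference = {difference(d):+.3f}")
    print(f"zero crossing near d = {zero_crossing():.2f} j.n.d.'s")
```

Under these assumptions the difference is negative near the object 
and positive beyond a crossing at about 12.5 j.n.d.'s. The 12.48 of the 
text corresponds to rounding the ratio 4.0/3.0 to 1.333 before taking 
the logarithm. 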

It is a matter of some interest to know exactly what the theoretical 
distance of the point of zero difference in adient-abient reaction 
potential is from the two objects. This is easily found owing to the 
fact that at this point the two reaction potentials are equal. From 
this and equation 20 we are able to write the equation, 

$$ {}_{S}E'_{R} \times 10^{-j'd} = {}_{S}E_{R} \times 10^{-jd}, $$

where the left-hand member (with primes) represents the abient 
reaction potential and the right represents the adient reaction 
potential, where it is assumed that ${}_{S}E'_{R} > {}_{S}E_{R}$. 
Accordingly we have, 

$$ d = \frac{\log \dfrac{{}_{S}E'_{R}}{{}_{S}E_{R}}}{j' - j}. \tag{57} $$

As an illustration of the use of this equation we substitute the 
relevant values involved in the preceding adient-abient interaction: 

$$ d = \frac{\log \dfrac{4.0}{3.0}}{.02 - .01} = \frac{\log 1.333}{.01} = \frac{.1248}{.01} $$

$$ \therefore\ d = 12.48 \text{ j.n.d.'s,} $$

which agrees very well with the graphic solution represented in 
Figure 57. 
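
The substitution above can be checked with a few lines of code. This 
is a minimal sketch rather than anything given in the text; the 
function name and its argument names are hypothetical, logarithms are 
taken to base 10, and the requirement that the abient exponent exceed 
the adient one is an added assumption needed for a positive distance. 

```python
import math


def equilibrium_distance(abient_max: float, adient_max: float,
                         abient_j: float = 0.02, adient_j: float = 0.01) -> float:
    """Distance of zero difference: d = log10(E'/E) / (j' - j), in j.n.d.'s.

    abient_max and adient_max are the maximum reaction potentials in sigma;
    abient_j and adient_j are the exponents of the two gradients.
    """
    if abient_max <= adient_max:
        raise ValueError("the abient maximum is assumed to exceed the adient maximum")
    if abient_j <= adient_j:
        raise ValueError("the abient exponent is assumed to exceed the adient exponent")
    return math.log10(abient_max / adient_max) / (abient_j - adient_j)


if __name__ == "__main__":
    # Using the unrounded ratio 4.0/3.0.
    print(f"{equilibrium_distance(4.0, 3.0):.2f}")
    # Using the ratio rounded to 1.333, as in the worked substitution.
    print(f"{equilibrium_distance(1.333, 1.0):.2f}")
```

The unrounded ratio gives a distance very slightly above the 12.48 
j.n.d.'s of the text, and the rounded ratio reproduces it; the 
difference is one of arithmetic precision only. 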

Now, oscillatory movements will meet opposition whenever they 
are in a direction either toward the adient-abient object or away 
from it, but more when toward the object than when away from 
it, as shown by the steeper difference gradient toward the double 
focal object. This means that oscillatory movements toward the 
ambivalent goal object will be shorter than those away from it. 
Even though there will be present no forces opposed to lateral 
movements, and consequently lateral oscillatory movements may 
be expected to be greater on the average than either forward or 
backward movements, any considerable movement at right angles 
to the path originally taken toward the ambivalent goal object 
must move away from O1O2, which will oppose the positive or 
adient gradient difference. This is to say that lateral movements 
from the original path toward O1O2 must maintain such a distance 
that the adient-abient gradient difference will always be zero. 
Consequently, all lateral movements must tend to be circular, with 
a radius equal to the distance from the double object to the point 
of zero difference in reaction potential. The locus of such lateral 
movements is shown in Figure 57. 

Generalizing on the above considerations we arrive at our next 
two theorems: 

THEOREM 86. Other things equal, with moderately sophisticated 
subjects, when an adient object and an abient object occupy nearly 
the same point in space and the maximum abient reaction potential 
is greater than the maximum adient potential, there will be a point of 
stable equilibrium at a j.n.d. distance from the adient-abient object 
amounting to 

$$ d = \frac{\log \dfrac{{}_{S}E'_{R}}{{}_{S}E_{R}}}{j' - j}. $$

THEOREM 87. Other things equal, when an adient-abient object 
occupies the same point in space and the maximum abient reaction 
potential is greater than the maximum adient, the oscillatory 
movements from the point of zero difference away from the object will 
be greater on the average than those toward the object, and those in a 
lateral direction will be greater on the average than either, the 
latter [...] and the double object being the center with a radius 
equal to the distance from the point of zero difference. 

It is a matter of empirical observation that ambivalent situations 
of this kind, in which the abient reaction potential is greater at its 
maximum than the adient at its maximum, do present points of stable 
equilibrium [...] roughly circular oscillatory movements whose [...] 
equilibrium distance [...] tend to be made by naive organisms. Lewin 
considered this situation theoretically and came [...] conclusions 
(10, p. 96). Apparently he also made empirical observations on young 
children which agree with the theory. His concept of field vector here 
corresponds roughly to our spatially generalized reaction potential. 
Theorem 87 may accordingly be said to have some empirical 
substantiation. 

FIGURE 57. [...] lateral movements must take under the conditions of 
equilibrium of [...] at a point in space. Reproduced from Hull (7, 
p. 290). 

Now let it be supposed that the motivation or the incentive of 
the adient reaction potential is increased. An increase in either of 
these will increase the value of the maximum reaction potential 
sEr in equation 57. Let us suppose that this is changed from 3.0σ 
to 3.5σ. Substituting in equation 57, we then have, 

$$ d = \frac{\log \dfrac{4.0}{3.5}}{j' - j} = \frac{\log 1.143}{.02 - .01} = \frac{.05805}{.01} $$

$$ d = 5.80 \text{ j.n.d.'s,} $$

which, since 5.80 < 12.48, indicates that the distance will be 
reduced. 

In a similar manner, in case the adient motivation or incentive 
is reduced so that the reaction potential falls from 3.0σ to 2.5σ, we 
have, 

$$ d = \frac{\log \dfrac{4.0}{2.5}}{j' - j} = \frac{\log 1.6}{.02 - .01} = \frac{.20412}{.01} $$

$$ d = 20.41 \text{ j.n.d.'s;} $$

i.e., since 20.41 > 12.48 the distance of the point of equilibrium 
will be increased. 

In an analogous manner we find that if the abient motivation 
is increased, this will increase the maximum abient reaction 
potential from, say, 4.0σ to 4.5σ. Substituting in equation 57, we 
have, 

$$ d = \frac{\log \dfrac{4.5\sigma}{3.0\sigma}}{j' - j} = \frac{\log 1.5}{.02 - .01} = \frac{.17609}{.01} $$

$$ d = 17.61 \text{ j.n.d.'s,} $$

which shows, since 17.61 > 12.48, that the equilibrium distance 
will be increased by an increase in the abient motivation. 

In a similar manner, if the abient motivation is decreased, the 
maximum abient reaction potential will be reduced. Suppose this is 
from 4.0σ to 3.5σ; then, by equation 57, we have, 

$$ d = \frac{\log \dfrac{3.5}{3.0}}{j' - j} = \frac{\log 1.167}{.02 - .01} = \frac{.06707}{.01} $$

$$ d = 6.71 \text{ j.n.d.'s,} $$

which shows, since 6.71 < 12.48, that the equilibrium distance will 
be decreased by a decrease in the abient motivation. 

Generalizing from the above considerations we arrive at the 
following two theorems: 

THEOREM 88. Other things equal, when an adient object and an 
abient object are combined so that the maximum abient reaction 
potential is appreciably greater than the maximum adient reaction 
potential, an increase in the adient incentive (K) or primary 
motivation (D) or both will decrease the distance of the point of 
equilibrium from the objects, and a reduction of the incentive or 
motivation or both will increase the distance of the point of 
equilibrium. 

THEOREM 89. Other things equal, when an adient object and an 
abient object are combined so that the maximum abient reaction 
potential is appreciably greater than the maximum adient reaction 
potential, if an increase is made in the abient motivation (D) there 
will be an increase in the distance of the point of equilibrium from 
the objects, whereas if a reduction is made in the abient motivation 
there will be a decrease in distance of the point of equilibrium. 
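
The four changes of motivation considered above can be run through 
equation 57 together. The sketch below is illustrative only; the 
names are hypothetical, the exponents .02 and .01 and the baseline 
maxima of 4.0σ and 3.0σ are those of the preceding substitutions, and 
logarithms are taken to base 10. 

```python
import math

J_ABIENT = 0.02        # exponent of the abient gradient (j')
J_ADIENT = 0.01        # exponent of the adient gradient (j)
BASELINE = (4.0, 3.0)  # (abient maximum, adient maximum), in sigma


def equilibrium_distance(abient_max: float, adient_max: float) -> float:
    """Equation 57: d = log10(E'/E) / (j' - j), in j.n.d.'s."""
    return math.log10(abient_max / adient_max) / (J_ABIENT - J_ADIENT)


def shift(abient_max: float, adient_max: float) -> str:
    """Direction in which the equilibrium point moves relative to baseline."""
    change = (equilibrium_distance(abient_max, adient_max)
              - equilibrium_distance(*BASELINE))
    return "increase" if change > 0 else "decrease"


# The four cases worked in the text: (abient maximum, adient maximum).
CASES = {
    "adient raised, 3.0 -> 3.5": (4.0, 3.5),
    "adient lowered, 3.0 -> 2.5": (4.0, 2.5),
    "abient raised, 4.0 -> 4.5": (4.5, 3.0),
    "abient lowered, 4.0 -> 3.5": (3.5, 3.0),
}

if __name__ == "__main__":
    for label, (abient, adient) in CASES.items():
        d = equilibrium_distance(abient, adient)
        print(f"{label}: d = {d:.2f} j.n.d.'s ({shift(abient, adient)})")
```

The directions of change agree with Theorems 88 and 89. The computed 
distances match the text to within the rounding of the ratios; the 
last case comes out near 6.7 rather than exactly 6.71 because the text 
rounds 3.5/3.0 to 1.167 before taking the logarithm. 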

We are fortunate in having available convincing experimental 
results bearing directly on the preceding theorems. Miller, Brown, 
and Lipofsky (14) trained albino rats to perform an adient reaction 
in an enclosed alley by feeding them at an end of the alley marked by 
a small light. They then built up an opposing or abient reaction by 
giving the animals electric shocks while eating. The results of this 
training were recorded by means of a light-weight cord attached to a 
little rubber harness placed on the animals immediately preceding the 
tests, the latter being given without shock. In the tests, as in the 
training trials, the animals were always placed at the beginning of 
the alley, i.e., the end opposite that at which they were fed. Four 
groups of animals were employed, in each group of which the intensity 
of one of the motivations was varied with the other intensities 
remaining constant. The several conditions, together with the effect 
on the distance of the point of equilibrium from the food-shock end of 
the alley, were as shown in Table 32. Thus Theorems 88 and 89 find 
complete verification so far as the primary motivation factor is 
concerned. The matter of incentive is left unverified. 

TABLE 32. Summary of the outcome of the Miller, Brown, and Lipofsky 
experiment as reported by Miller (13, p. 437). 

    Change in motivation                    Effect of motivation change on
    Adient motivation   Abient motivation   distance of point of equilibrium
                                            from adient-abient object
    -----------------   -----------------   --------------------------------
    increase            constant            decrease
    decrease            constant            increase
    constant            increase            increase
    constant            decrease            decrease

Behavior Potential Fields, Barriers, and the Purely Spatial Habit-Family 
Hierarchy 

In our account of abient-abient interaction we had occasion to 
assume a situation in which the organism was placed within a 
narrow alley. The organism could not escape from this hypothetical 
alley because it consisted of a complex enclosing barrier. Since in 
our further consideration of adience fields the matter of barriers 
will frequently be encountered, we shall now pause to consider the 
essential characteristics of barriers as such. Perhaps the most 
significant type of barrier with which to begin this discussion is 
that which is relatively transparent but impossible of penetration, 
such as a glass wall, a strong wire screen, a set of bars, or an 
obstacle over which an adient object may be seen, say, but which 
cannot easily be surmounted. 

Take, for example, the case of a naive organism moving toward [...] 
through a glass barrier of [...] perpendicular to the organism's 
line of vision. The [...] complex relative to the adient object so 
received will be [...] slightly different from what it would be if 
the barrier [...] the afferent interaction will be [...] at very 
[...] locomotion will proceed in a straight line [...] slightly 
reduced rate, until the glass [...] completely unsophisticated it 
will advance [...] impinges on the glass, which will (1) bring it to 
a halt [...] discontinue the progressive change in the [...] 
increasing [...] object), and (2) cause the occurrence or [...] 
radically different set of cutaneous and even injurious [...] 
(representing needs) will be reduced [...] the subject's reflex 
withdrawal from the barrier, which [...] conditioned abient or 
withdrawal habits to the [...] stimulus. If the injury has been 
intense this abience [...] considerable. 

But, since the barrier is [...] object and will cause no injury 
except that received [...] a considerable impact, it follows that 
with repeated stimulation [...] (Theorem 58) the gradient of abience 
will become progressively steeper until ultimately it will be 
practically vertical; i.e., [...] avoided only to the extent that 
accidental contacts [...] occur, and even these may occur if the 
barrier is not such as to make mere contact injurious. This is why, 
generally speaking, organisms with normal sense organs avoid barriers 
on the basis of distance receptors and rarely come into physical 
contact [...] though the limiting abient distance is ordinarily 
minute. 

Generalizing on the preceding considerations we formulate our 
next theorem: 

THEOREM 90. Smooth and strictly static absolute barriers are abient 
objects, the reaction potential gradients of which normally attain 
early in the organism's interaction with them a practically vertical 
degree of steepness near d = 0. 

As the next step in our analysis of organismic behavior toward 
abient barriers, let us consider the organism’s discrimination of 
the visual image of the adient object as it appears through the 
barrier, and the image as it appears without the intervention of 
the barrier. In this connection the reader will need to recall the 
fact of discrimination learning (Chapter 3) and especially the 
principle of pattern discrimination, which are here assumed with- 
out further comment. As a result of maximal stimulus pattern 
discrimination acquired in conjunction with the process known as 
compound trial-and-error learning (Chapter 6), the organism 
will halt its adient locomotion as soon as the distinction between 
the image of the adient object alone and that of the adient object 
seen through the transparent abient object (barrier) becomes great 
enough, since the combined stimulus pattern has become an in- 
hibitory stimulus for that act. Moreover, such situations in the 
past have, through compound trial-and-error learning, set up 
exploratory receptor-exposure acts. These latter acts may reveal 
free space a little to one side of the barrier. Locomotor trial and 
error, originally occurring on the basis of the oscillation function, 
will lead the organism far enough to one side of the barrier for the 
reception of an unobstructed visual image of the adient object, 
whereupon a new adient gradient within the adient reaction 
potential field of the organism will evoke uninterrupted locomotion 
to the adient object and to consequent further reinforcement. 

Generalizing from the above considerations we arrive at our 
next two theorems: 

THEOREM 91. When an adient object and an abient object are 
combined in the same situation and stimuli normally evoking adient 
or abient behavior in open space are conjoined with other stimuli 
which arise from an abient object (barrier), the resulting stimulus 
pattern will check the adient or abient behavior otherwise initiated 
or partially evoked and then give rise to visual and other 
exploratory behavior. 

THEOREM 92. When an adient object and an abient object are 
combined in the same situation and a barrier stimulus pattern has 
given rise to exploratory behavior which reveals open space at one 
side of the barrier, trial-and-error behavior will lead to a detour, 
the unimpeded view which results from this activity serving as a 
secondary reinforcement of the detour behavior, after which the 
adient or abient behavior will continue from the new position at one 
side of the barrier. 

Since adient behavior converges in general toward the adient object 
it is evident that after the detour the direction of the path will 
again turn toward the adient object. In the case of abient behavior, 
however, since the latter field usually radiates in every direction 
from the abient object, the organism following a detour will [...] 
toward the path interrupted by the barrier [...] direction which 
diverges from the one which it would have taken except for the 
barrier. 

[...] adient behavior has rounded [...] (corner), the organism will, 
even before it receives an unobstructed stimulus of the adient 
object, resume the normal adient [...] object from that point [...] 
the barrier [...] as from the other. As pointed out above, locomotor 
trial [...] the oscillation function, lead the organism to perform 
the detour or umweg. From this, together with the principle of less 
work (5, p. 293) and of the gradient of delay in reinforcement (iii A 
or iii B), it follows that the organism will tend to prefer the 
alternative course or path involving the shorter distance of 
locomotion (and less work), and therefore will choose it more 
frequently than the longer path around the farther end of the 
barrier. An S → R diagram showing the behavior theory of the two 
paths is shown as Figure 58. Assuming that the short path is two 
seconds in duration and that the long path is five seconds in 
duration, the short goal gradient figures out, old style, at 3.155σ 
and the long one at 1.581σ. The reader will note that since the two 
alternative paths terminate at the same point in space, they 
constitute a special case of the habit-family; and since the shorter 
path is normally preferred to the longer one, the two constitute a 
hierarchy, the smallest number possible. We accordingly have here a 
special and limiting case of the habit-family hierarchy, a secondary 
principle of very wide application about which we will hear more 
presently. 

Generalizing on the preceding considerations we arrive at our 
next theorem: 

THEOREM 95. Other things equal, organisms which are presented 
with alternative paths in detouring about a barrier to an adient 
object will learn to prefer the one involving the shorter distance. 

The validity of Theorem 95 is attested by general observation. 

At this point we must consider with some care a secondary 
principle of major behavioral importance. This is the principle of 
the habit-family hierarchy or motor equivalence to which we referred 
immediately above. The general concept of the habit-family hierarchy 
is this: when a single locomotor path habit is set up, it involves an 
infinite number of potential paths in free space, all terminating at 
the same goal point. Because of the principle of less work, the 
shortest and least laborious of these potential paths will be 
preferred to the others of the hierarchy. In case the first path 
found to a given goal is indirect, i.e., circuitous, the organism 
will naturally tend automatically to shift to the shortest or most 
preferred path available. Our present task is to try to understand 
how this automaticity comes about. 

The discussion which led to the formulation of Theorem 95, 
and particularly to Figure 58, will aid materially in this 
understanding. A perusal of this figure and the associated discussion 
will suggest some of the principles which are operative in this 
situation. For one thing, optical fixation (oSf) is an important 
factor, as are the other external stimuli (eS1, eS2, etc.) both 
optical and otherwise. Riesen's classical study of visual perception 
(15) strongly suggests that the meaning of visual fixation stimuli is 
acquired, at least by anthropoids, very early in life. As a result, 
optical convergence in fixation yields an indication of the distance 
of seen objects. At the same time the size of the optical image in 
conjunction with the degree of optical convergence (distance) 
indicates the size of the object. Also the angle of the point fixated 
shows the direction of the goal. And the matters of image size and of 
the intensity of optical convergence introduce the principle of 
stimulus-intensity generalization. In addition to the above primary 
stimulus or perceptual principles, there is an important secondary 
principle known as the gradient of reinforcement, J, (iii A, or 
iii B). This has been utilized above (Chapter 5) through the 
mediation of the fractional antedating goal reaction, chiefly through 
stimulus generalization on the perseverative stimulus trace as a 
continuum. In the present situation the stimulus generalization is 
conceived to operate on the basis of optical fixation stimuli and 
their traces. 

In considering the transfer of training from a long or indirect 
path to a short or direct one we present two main cases. They are 
represented diagrammatically in Figure 59 A and 59 B. We will 
first take up the simpler case seen in Figure 59 A. Let us assume that 
the original habit was set up in relatively free space substantially 
like the long sequence of Figure 58, and that the shortest path in 
this free space is a straight line normally requiring only one second 
to traverse; this is considerably shorter than the short sequence of 
Figure 58. Now this amount of delay in reinforcement would yield a 
reinforcement gradient value of 3.971σ, on the assumption that under 
the current reward conditions zero seconds' delay would yield 5.0σ. 
We also assume that the goal object was visible from the starting 
point but was not known to be a goal until after it was found and 
the reinforcing substance (K') was consumed. Under these conditions 
Ri will have its full strength of 1.581σ, since all of its bonds 
at the outset of the five-second (longer) path (Figure 58) are 
operative. 
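
The gradient values quoted for the several paths can be reproduced 
with a short calculation. The sketch below is an illustration under an 
assumption that the text does not state: that the gradient of 
reinforcement falls off as 5.0σ × 10^(-.1t) with the delay t in 
seconds. The exponent of .1 is inferred only because it reproduces the 
quoted values for delays of zero, one, two, and five seconds; the 
function and path names are hypothetical. 

```python
def reinforcement_gradient(delay_seconds: float,
                           maximum: float = 5.0,
                           exponent: float = 0.1) -> float:
    """Reaction potential (in sigma) after a given delay in reinforcement.

    ASSUMPTION: maximum * 10 ** (-exponent * delay); the exponent of 0.1
    is inferred from the quoted values, not taken from the text.
    """
    if delay_seconds < 0:
        raise ValueError("delay must be non-negative")
    return maximum * 10 ** (-exponent * delay_seconds)


# Durations, in seconds, of the paths discussed in the text.
PATHS = {"direct line": 1.0, "short detour": 2.0, "long detour": 5.0}


def preferred_path(paths: dict) -> str:
    """Name of the path with the strongest gradient, i.e. the shortest delay."""
    return max(paths, key=lambda name: reinforcement_gradient(paths[name]))


if __name__ == "__main__":
    for name, seconds in PATHS.items():
        value = reinforcement_gradient(seconds)
        print(f"{name} ({seconds:.0f} sec.): {value:.3f} sigma")
    print("preferred:", preferred_path(PATHS))
```

On this assumption the ordering of the three values is the ordering 
of the habit-family hierarchy: the shorter the delay, the stronger the 
gradient and the more preferred the path. 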
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and particularly to Figure 58, will aid materially in this understand- 
ing. A perusal of this figure and the associated discussion will 
suggest some of the principles which are operative in this situation. 
For one thing, optical fixation (oSf) is an important factor, as are 
the other external stimuli (eSi, eS 2 , etc.) both optical and other- 
wise. Riesen’s classical study of visual perception {15) strongly 
suggests that the meaning of visual fixation stimuli is acquired, at 
least by anthropoids, very early in life. As a result, optical con- 
vergence in fixation yields an indication of the distance of seen ob- 
jects. At the same time the size of the optical image in conjunction 
with the degree of optical convergence (distance) indicates the 
size of the object. Also the angle of the point fixated shows the 
direction of the goal. And the matters of image size and of the in- 
tensity of optical convergence introduce the principle of stimulus- 
intensity generalization. In addition to the above primary stimulus 
or perceptual principles, there is an important secondary principle 
known as the gradient of reinforcement, J, (iii A, or hi B). This 
has been utilized above (Chapter 5) through the mediation of the 
fractional antedating goal reaction, chiefly through stimulus gener- 
alization on the perseverative stimulus trace as a continuum. In 
the present situation the stimulus generalization is conceived to 
operate on the basis of optica! fixation stimuli and their traces. 

In considering the transfer of training from a long or indirect 
path to a short or direct one we present two main cases. They are 
represented diagrammatically in Figure 59 A and 59 B. We will 
first take up the simpler case seen in Figure 59 A. Let us assume that 
the original habit was set up in relatively free space substantially 
like the long sequence of Figure 58, and that the shortest path in 
this free space is a straight line normally requiring only one second 
to traverse; this is considerably shorter than the short sequence oj Figure 58. 
Now this amount of delay in reinforcement would yield a reinforce- 
ment gradient value of 3.971<r, on the assumption that under the 
current reward conditions zero seconds* delay would yield S.Otr. 
We also assume that the goal object was visible from the starting 
point but was not known to be a goal until after it was found and 
the reinforcing substance (K') was consumed. Under these condi- 
tions Ki will have its full strength of LSSlc, since all of its Iwnds 
at the outset of the five-second (longer) path (Figure 58) arc 
operative. 
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and particularly to Figure 58, will aid materially in this understand- 
ing. A perusal of this figure and the associated discussion will 
suggest some of the principles which are operative in this situation. 
For one thing, optical fixation (oSf) is an important factor, as are 
the other external stimuli (eSi, eS2, etc.) both optical and other- 
wise. Riesen’s classical study of visual perception {IS) strongly 
suggests that the meaning of visual fixation stimuli is acquired, at 
least by anthropoids, very early in life. As a result, optical con- 
vergence in fbcation yields an indication of the distance of seen ob- 
jects. At the same time the size of the optical image in conjunction 
with the degree of optical convci^ence (distance) indicates the 
size of the object. Also the angle of the point fixated shows the 
direction of the goal. And the matters of image size and of the in- 
tensity of optical convergence introduce the principle of stimulus- 
intensity generalization. In addition to the above primary stimulus 
or perceptual principles, there is an important secondary principle 
known as the gradient of reinforcement, J, (iii A, or iii B). This 
has been utilized above (Chapter 5) through the mediation of the 
fractional antedating goal reaction, chiefly through stimulus gener- 
alization on the perseverative stimulus trace as a continuum. In 
the present situation the stimulus generalization is conceived to 
operate on the basis of optical fixation stimuli and their traces. 

In considering the transfer of training from a long or indirect 
path to a short or direct one we present two main cases. They are 
represented diagrammatically in Figure 59 A and 59 B. We will 
first take up the simpler case seen in Figure 59 A. Let us assume that 
the original habit was set up in relatively free space substantially 
like the long sequence of Figure 58, and that the shortest path in 
this free space is a straight line normally requiring only one second 
to traverse; this is considerably shorter than the short sequence of Figure 58. 
Now this amount of delay in reinforcement would yield a reinforce-
ment gradient value of 3.971σ, on the assumption that under the
current reward conditions zero seconds’ delay would yield 5.0σ.
We also assume that the goal object was visible from the starting
point but was not known to be a goal until after it was found and
the reinforcing substance (K') was consumed. Under these condi-
tions R1 will have its full strength of 1.581σ, since all of its bonds
at the outset of the five-second (longer) path (Figure 58) are
operative.
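
The two gradient values just quoted can be checked numerically. The sketch below assumes that the gradient of delay of reinforcement has the negative-growth form J = M × 10^(−jt), with M = 5.0σ at zero delay as stated in the text. The value j = 0.1 per second is not given in this passage; it is an inference, adopted only because it reproduces the figures 3.971σ and 1.581σ. The function name is illustrative.

```python
def reinforcement_gradient(delay_seconds, max_sigma=5.0, j=0.1):
    """Reaction potential (sigma units) after a given delay of reinforcement.

    Assumed form: J = M * 10**(-j * t). The exponent constant j = 0.1 is an
    inferred value, not one stated in the passage.
    """
    return max_sigma * 10 ** (-j * delay_seconds)


short_path = reinforcement_gradient(1)  # one-second direct path
long_path = reinforcement_gradient(5)   # five-second indirect path
print(f"1 s delay: {short_path:.4f} sigma")
print(f"5 s delay: {long_path:.4f} sigma")
```

On this reading the one-second path receives roughly two and a half times the gradient value of the five-second path, which is the margin the later bond arithmetic works against.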
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tion of oSf), SD = hunger drive stimulus, R1, R2, etc., and RI, RII, etc., are the locomotor acts of the respective paths, rG is the frac-
tional antedating goal reaction assumed to be common to the two paths, and sG is the common goal stimulus.
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ditions the short path, RI, which has not yet been traversed or
reinforced in this situation, will be preferred over R1, which has
been traversed and directly reinforced.

We next pass to the second and far more extreme situation of
Figure 59 B; here not only SD is lacking but oSf also is lacking, except for
its appearance through secondary reinforcement as in Hebb’s box-
opening example (5, pp. 153-155). Assuming in this case that bond
eS1 → rG will be sufficiently strong to evoke rG, sG after a little delay
will in turn tend to evoke both R1 and RI. Now two full bonds to R1
would yield 1.054σ, and one full bond to RI would yield 1.324σ.
Probably both values would be less than these figures indicate,
though it is believed that the actual outcome would be in roughly
the same proportion. But 1.324σ > 1.054σ. Therefore once again
the short path, RI, which has not been traversed or directly rein-
forced in this situation, will be preferred to R1, which has been
traversed and directly reinforced.
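
The bond arithmetic of this second case can be sketched as follows. The sketch assumes that each response is supported by three equally weighted bonds and that each operative bond contributes one third of the full gradient value; the text speaks of three bonds but does not state the equal weighting, so that part is an assumption. The constant and function names are illustrative.

```python
FULL_LONG = 1.581   # sigma: long (five-second) path with all bonds operative
FULL_SHORT = 3.971  # sigma: short (one-second) path with all bonds operative
TOTAL_BONDS = 3     # assumption: three equally weighted bonds per response


def potential(full_value, bonds_operative, total_bonds=TOTAL_BONDS):
    """Scale a full reaction potential by the share of bonds still operative."""
    return full_value * bonds_operative / total_bonds


long_two_bonds = potential(FULL_LONG, 2)
short_one_bond = potential(FULL_SHORT, 1)
print(f"long path, two bonds: {long_two_bonds:.3f} sigma")
print(f"short path, one bond: {short_one_bond:.3f} sigma")
print("short path preferred:", short_one_bond > long_two_bonds)
```

Under these assumptions a single bond to the short path still outweighs two bonds to the long path, because the short path's gradient value is so much larger.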

Hebb seems to agree with this general approach to response 
variability, though he appears to feel that the variability in the 
stimuli presents a serious difficulty in the theory (5, p. 155). The 
oscillation function, it will be recalled, necessarily requires that 
the rG of eating, which is the goal here, will vary over a small zone.
But Hebb seems to forget that stimulus generalization should easily
be able to bridge these small deviations in sG. A very similar general-
ization bridge over oscillatory variability has been explained rather
elaborately in another work (5, pp. 194-196). 

Generalizing from the above considerations, we arrive at our 
next theorem: 

THEOREM 96. When an indirect member of a locomotor habit-
family hierarchy has attained a goal in a novel situation involving
relatively free space, the learning then acquired is transferred to the 
initial segments of remaining members of the hierarchy without specific 
practice and on subsequent trials is manifested by the organism’s 
spontaneous choice of the most direct path. 

General observation confirms the validity of Theorem 96 very 
fully. In addition, the classical maze work of Dashiell (2) confirms
this theorem experimentally in an elegant manner (see Figure 66, 
Chapter 9). 
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On the other hand, one of the original three bonds leading to
the short path (RI) will be partially lacking, since from the starting
point the stimulus of the view of the short path should be somewhat
different from that of the view of the long path, which has been
reinforced. This difference should reduce by an uncertain amount
the generalization from eS1 → R1 to eS1 → RI. On the other hand,
the generalization from oSf → R1 will be practically complete to
oSf → RI, since the goal object will be the same except for size
and the size will be larger on the transfer. Finally, the drive stimuli
(SD) will be strictly identical in the two cases. If we assume that
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The Angle That the Beginning of a Path Makes with the Direct Line to an
Adient Goal and Its Influence on Initial Reaction Potential
At this point we turn to the question of the relative reaction poten- 
tials possessed by the various members of a naive organism’s spatial 
habit-family hierarchy as based on the angle that the initial segment 
of a certain path makes with a direct line to an adient goal. Con- 
sider the starting point (S) and the goal point (G) in Figure 60. 



FIGURE 60. Diagrammatic representation of the typical mean lengths of various path-
ways belonging to a purely spatial habit-family hierarchy whose beginnings diverge by
different amounts from a straight line between the starting point (S) and the adient
goal object (G). Reproduced from Hull (7, p. 284).

Now, we may assume that from much goal seeking and from en-
countering various barriers to the goals in the past, normal loco-
motor organisms will have found by trial that as a rule paths which
make an angle with the direct path, SAG, will require more locomo-
tion and time to reach G by SBG and SCG than by SAG itself.
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of the several paths between S and G in Figure 60 will be associated 
with the angular deviation of the initial segment of each possible 
path from a direct line to the goal. As a result of this previous habit 
formation, the organism will, without additional training, come 
to prefer the following hierarchy: path SAG, path SBG, paths SCG
and SDG, and, last of all, path SFG. 

From these considerations we arrive at our next theorem: 

THEOREM 97. Other things constant, the various possible alternative
potential paths in free space from a starting point to an adient goal 
will tend without additional special training to create reaction potentials 
which are jointly a function of the strength of the reaction potential 
for the direct path and an inverse function of the magnitude of the angle 
that the beginning of each potential path makes with a straight line 
connecting the starting point and the adient goal object. 

Since reaction potential as such cannot be directly observed, at 
least by another organism, Theorem 97 cannot be tested experi- 
mentally. However, it should be possible to test it by means of the 
latencies (XIV) of the acts the organism performs in taking the 
several paths. 



FIGURE 61. Diagrammatic representation of a detour or "Umweg" situation caused by
a U-shaped barrier placed in the direct path of an organism at S with its goal at G. 
The goal object is supposed to be visible but the barrier, impassable (7, p. 281). 

Let it be supposed, now, that an organism possessing this spatial 
habit-family hierarchy is placed behind the U-shaped barrier repre- 
sented in Figure 61 in such a way that it can optically fixate the 
adient goal object, e.g., through bars, but cannot go directly to G 
by reason of the barrier. According to the habit-family hierarchy 
the excess of the reaction potential toward SAG over SBG will
cause various exploratory movements into alternative paths or 
subpaths closely resembling SAG in general, in the order of the 
reaction potential of each path’s initial segment. But since these 
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much preferred members of the habit-family hierarchy do not lead 
to the goal they will gradually be extinguished (6, p. 139; 7, p. 

When all have been extinguished to a point below the reaction
potential of the first possible real path, SBG, this will be taken. But
experimental extinction requires a certain amount of time (8,
pp. 258 ff.).

From these considerations we arrive at our next theorem (7,
p. 281):


THEOREM 98. Other things constant and no additional motivations 
present, a spatially naive organism oriented to a given goal will, when
finding itself behind a U-shaped barrier in the direct path, spend some 
time in efforts to reach the goal by paths through the barrier deviating 
progressively more from a straight line to the goal before these tendencies 
are experimentally extinguished, when a really possible path around 
the barrier will be taken. 


No directly relevant empirical evidence bearing on this theorem 
has been found, though general observation makes its soundness 
highly probable. 

Let it be assumed that an animal is placed behind one of two 
U-shaped barriers to a goal object, such as that shown at the 
left in Figure 61, except that the backward-turned arms of the U 
of one barrier are appreciably shorter than those of the other. Now, 
the shortening of the arms of the one U-shaped barrier will make 
smaller the angle drawn from the subject’s stance to the tip of the 
barrier arm as seen in conjunction with the straight line SAG, than 
would be the case if the arms were longer. But, by Theorem 97, the 
greater the visual angle the initial segment of a potential path 
makes with a direct line to the goal, the weaker will be the sER to
taking the potential path; and the weaker the reaction potential
to the potential act is, the longer will be the time required for the
experimental extinction of the direct path down to the potential
path level. Therefore, with the shorter arms, the less will be the
time required for a naive organism to extinguish the search for
shorter and more favored paths to that goal.

From these considerations we arrive at our next theorem (7):

THEOREM 99. Other things constant and no additional motivation 

present, a naive organism oriented to a given adient goal, when finding 
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itself behind a U-shaped barrier which it can see through but cannot 
surmount, will in general require a shorter time to detour this barrier 
successfully if the backward turning arms of the U are short than if 
they are long. 

No direct empirical evidence bearing on this theorem has been 
found. 

Let us assume further that each of two similar organisms finds
itself behind a separate U-shaped barrier to the same goal. In the
case of one organism, however, the barrier is appreciably closer
to the goal than in that of the other, as shown in Figure 61. By
Theorem 97, other things constant, the reaction potential at the
beginning of SAG will, by reason of its comparative nearness to the
goal, be greater than that at the beginning of SA'G' (iii). Also by
Theorem 97, since the angle from the straight line at the beginning
of the detour path to the goal is in both cases the same, the reaction
potential to the detour path will constitute the same function of the
direct path in the two cases. Suppose it is 40 per cent that of the
direct path and that the two direct-path reaction potentials are
3.0σ and 1.0σ respectively. On these assumptions the initial seg-
ment of the detour path to G would have a reaction potential of
3.0σ × .40 = 1.2σ, whereas that to G' would have one of 1.0σ
× .40 = .4σ. This leaves to be extinguished before the detour can
be made a difference in reaction potential between the direct line
and the detour path of

3.0σ − 1.2σ = 1.8σ,

in the case of G, whereas in the case of G' it will be,

1.00σ − .40σ = .60σ.

But, other things constant, it takes longer to extinguish a large
amount of reaction potential than a small amount. It should there-
fore take longer to extinguish 1.8σ than .60σ.

Generalizing on the preceding considerations, we arrive at our
next theorem:


THEOREM 100. Other things constant, naive organisms
will require longer to choose a detour path around a U-shaped barrier
to a seen goal when the latter is close to the barrier than when it is
farther away from it.
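
The arithmetic behind Theorem 100 can be restated as a short sketch. It uses only the figures given in the derivation (direct-path potentials of 3.0σ and 1.0σ, and a detour path at 40 per cent of the direct path); the function name is illustrative, and the sketch takes for granted the text's premise that a larger amount of reaction potential takes longer to extinguish.

```python
def potential_to_extinguish(direct_sigma, detour_fraction=0.40):
    """Gap between direct-path and detour-path reaction potential (sigma units)."""
    detour_sigma = direct_sigma * detour_fraction
    return direct_sigma - detour_sigma


near_goal = potential_to_extinguish(3.0)  # barrier close to the goal (G)
far_goal = potential_to_extinguish(1.0)   # barrier farther from the goal (G')
print(f"near goal: {near_goal:.2f} sigma to extinguish")
print(f"far goal:  {far_goal:.2f} sigma to extinguish")
```

Because the detour fraction is the same in both cases, the amount to be extinguished scales directly with the direct-path potential, which is why nearness to the goal lengthens the detour time.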
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Kohler reports a case bearing on Theorem 100. A Canary Isle 
bitch was standing behind a wire fence which with an adjoining 
house wall made an obstruction much like that shown in Figure 61. 
We quote from Kohler (9, p. 14): “. . . over which food is
thrown to some distance; the bitch at once dashes out to it, describ- 
ing a wide bend. It is worth noting that when, on repeating this 
experiment, the food was not thrown far out, but was dropped only 
just outside the fence, so that it lay directly in front of her, separated 
only by the wire, she stood seemingly helpless, as if the very nearness 
of the object and her concentration thereon . . . blocked the ‘idea’ 
of the wide circle around the fence; she pushed again and again 
with her nose at the wire fence and did not budge from the spot.” 

Now let us assume that in the situation represented at the left 
of Figure 61 we have two groups of organisms; the first group has a 
strong drive for the goal object, e.g., food, and the second group
has a weak drive. There is reason to believe that, other things
constant,

sER = sHR × D.

Assuming that when the sHR = .80 the strong drive equals 3.00σ
and the weak drive equals 1.80σ, we find by the above equation
that these two drives yield the following reaction potentials:


sER = .80 × 3.00σ = 2.40σ
sER = .80 × 1.80σ = 1.44σ.


Assuming, also, that in all cases path SAG has three times the sER
that path SBG has, it follows that on the average (sOR) something
like two-thirds of the reaction potential of path SAG must be
extinguished before path SBG can be chosen.

But,

2/3 × 2.40σ > 2/3 × 1.44σ.

Moreover, as noted above, the extinction of a strong reaction
potential requires more time than that of a weak one, other things
equal. Therefore, the extinction of 2/3 × 2.40σ
will require more time and effort than will that of 2/3 × 1.44σ.

From these considerations we arrive at our next theorem:
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THEOREM 101. Other things constant, the stronger the drive to a
given goal object behind a U-shaped barrier, the more the time and work
which will be required by a naive organism before the occurrence of
sufficient extinction to yield the execution of a successful detour.
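
The derivation of Theorem 101 can be followed step by step in the sketch below. It takes the relation of reaction potential to habit strength and drive as the simple product used in the derivation, and it reads the "two-thirds" figure as the gap between a direct path and a detour path having one third of its potential. Function names are illustrative.

```python
def reaction_potential(habit_strength, drive_sigma):
    """Reaction potential as habit strength times drive, as in the derivation."""
    return habit_strength * drive_sigma


def amount_to_extinguish(direct_sigma, direct_to_detour_ratio=3.0):
    """Potential the direct path must lose to fall to the detour path's level."""
    detour_sigma = direct_sigma / direct_to_detour_ratio
    return direct_sigma - detour_sigma


for label, drive in (("strong drive", 3.00), ("weak drive", 1.80)):
    direct = reaction_potential(0.80, drive)
    print(f"{label}: direct path = {direct:.2f} sigma, "
          f"to extinguish = {amount_to_extinguish(direct):.2f} sigma")
```

With a ratio of three to one the amount to be extinguished is always two-thirds of the direct-path potential, so the stronger drive leaves the larger amount to be extinguished.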

Lewin gave some consideration to this problem and based his 
conclusions, apparently, on the observed behavior of young children. 
In this connection he stated (10, p. 83): “But if we continue to
strengthen the valence, the solution of the task ceases to be facili-
tated and instead becomes more difficult. The strength of the
attraction then makes it doubly difficult for the child to start in a
direction opposed to the field force. Instead, the child will execute
with all its energy, affective meaningless actions in the direction
of the valence.” We accordingly may say that Theorem 101 prob-
ably has empirical corroboration. It may be noted that Lewin’s
use of the expressions “valence” and “field force” corresponds
roughly to our use of the expression “reaction potential,” and that 
his expression, “restructuring of the field,” corresponds in effect to 
the results of experimental extinction upon the preferred members 
of the spatial habit-family hierarchy. 

Again, let us assume the situation represented at the left of 
Figure 61, with two equivalent organisms facing this barrier for 
the first time. With one organism the goal object has a K value of 
.80, and with the other organism the lure (e.g., a smaller goal 
object) has a K value of .40. Both are assumed to have a primary 
motivation (e.g., hunger) of 3.0σ, and a habit strength of 1.0. Now,
by an earlier form of equation 8, these two situations yield different
reaction potentials as follows:

sER = 3.0σ × 1.0 × .80 = 2.40σ.
sER = 3.0σ × 1.0 × .40 = 1.20σ.

Here again we assume that the direct path SAG must be ex-
tinguished to about two-thirds of its reaction potential before the
path SBG can be chosen.

But,

2/3 × 2.40σ > 2/3 × 1.20σ.

Accordingly by reasoning exactly analogous to that leading to 
Theorem 101, we arrive at our next theorem (6): 
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THEOREM 102. Other things constant and no other motivations
present, the greater the incentive to action of the goal object behind a
U-shaped barrier, the more the time and work which will be required
before a successful detour will be executed by a naive organism.
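
The incentive case runs parallel to the drive case and can be sketched in the same way. The sketch uses the product of drive, habit strength, and incentive (K) written out in the derivation, with the values given there, and it compares the two-thirds products that appear in the inequality. Names are illustrative.

```python
def reaction_potential(drive_sigma, habit_strength, incentive_k):
    """Reaction potential as drive times habit strength times incentive (K)."""
    return drive_sigma * habit_strength * incentive_k


EXTINCTION_SHARE = 2 / 3  # share of the direct-path potential in the inequality

# Two-thirds product for each incentive value, with drive 3.0 and habit 1.0.
burden = {
    k: EXTINCTION_SHARE * reaction_potential(3.0, 1.0, k)
    for k in (0.80, 0.40)
}
for k, sigma in burden.items():
    print(f"K = {k:.2f}: two-thirds of direct-path potential = {sigma:.2f} sigma")
```

Halving the incentive halves the product, so the larger lure is the one associated with the larger amount of reaction potential standing in the way of the detour.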

Lewin considered this problem also. He remarked, apparently 
with empirical behavior of young children in mind, “ . . . the 
prospect of an especially intense reward . . . may impede the 
solution ...” (10, p. 84).


Summary 


All behavior occurs in space, but certain behavior, if it is to be 
adaptive, must take place in specific geometrical relationship to 
particular objects in space. From this point of view there are two
primary but opposite relationships — that of approach or adience, 
and that of avoidance or abience. In situations in which approach 
must occur before reinforcement can take place, habits of approach 
behavior are in general set up through trial and error; and, like- 
wise, habits of avoidance behavior are set up through trial and 
error in situations in which avoidance must occur before reinforce- 
ment can take place. Both adient and abient behavior are ordinarily 
locomotor in nature and are conditioned in part to objects and in 
part to distance reception continua. Because of the generalized 
nature of locomotion and the strong stimulus generalization char- 
acteristics of objects, of distance reception continua, and of the 
proprioception of primary orientation movements, adient behavior 
and abient behavior are highly generalized in respect to both direc- 
tion and distance. 


Adient behavior and abient behavior both have gradients of 
reaction potential which are high near the objects in question and 
decline with distance from the objects, probably roughly according
to a negative growth function. This function is generally character-
istic of both (1) stimulus generalization (8, p. 185) presumably
operating mainly on the basis of distance reception continua, chiefly
visual, where space is unobstructed, and (2) the gradient of delay
of reinforcement (iii A and B) or goal gradient (5, pp. 135 ff.),
presumably operating exclusively where the focal object is not
available to any distance receptor. Owing to the process of dis-

the equation for abient behavior to 
static objects or stimuli is ordinarily steeper than that of the 





equation for adient behavior. Chiefly because of the principle of
less work, the paths of both adience and abience will tend strongly
to laterally straight lines.

The organism’s approach to an adient object in free space may
obviously occur from any or all directions; these several adient
paths naturally converge. Withdrawal from an abient object in 
free space may obviously be in any direction and these several 
abient paths naturally diverge. Accordingly both adience and 
abience, which at first glance appear to be gradients with simple 
linear bases, actually when considered comprehensively involve 
areas, i.e., two-dimensional space at the least. The theory of adient 
and abient behavior thus involves examples of bona fide field theory, 
though this theory must not be confused with physical field theories, 
from which the present theory differs in most respects. The organism 
in behavior field theory corresponds to the particle subject to im- 
pulsion in physical field theories, and the energy involved in the 
transition in space arises in the main from the food eaten by the 
organism, rather than from the field. 

Much of the available theory of adience and abience concerns 
the interaction of these behavior potential fields. In general, where 
two adient fields are in competition, the organism will choose the 
nearer adient object; and the greater the difference in the distances 
between the objects, the greater is this probability. In a clearly 
analogous manner, the choice time or reaction latency is likely 
to be greater, the less the difference in distance between the compet- 
ing adient objects. Similarly, reaction latency is likely to be reduced 
by either an increased motivation (D) or an increased incentive 
(K), especially where one of the competitors is favored by the 
differential drive or incentive, though this is not a necessary 
condition. 

The interaction of two abient fields of reaction potential has two 
main cases: that in which the organism is in a restraining alley 
with an abient object at either end, and that in which the organism 
is placed in free space on a line between the objects. In the case of 
the restraining alley with duplicate abient objects at either end, the 
organism tends to move from the neighborhood of either (abient) 
object to a point midway between them where the difference in 
reaction potential is zero. The closer the organism is to either 
abient object, the faster will be the movement toward the point of 
zero reaction difference, and the more certain the movement is to 
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occur. In case one of the abient objects has greater drive than the 
other, the increased reaction potential of that gradient will cause 
a displacement of the point of zero reaction potential difference 
away from that end of the alley. In case the organism is placed at 
the point of zero reaction potential difference between two abient 
duplicate objects in open space, the action of behavioral vector 
summation based on small unbalancing movements due to the 
oscillation factor will generate a lateral movement to one side or 


the other at right angles to the line connecting the objects. 
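
The point of zero reaction-potential difference in the restraining alley can be located with a small calculation. The sketch below assumes that each abient gradient falls off as a negative growth function of distance, strength × 10^(−j × distance); the constant j, the alley length, the strengths, and the use of arbitrary distance units are all illustrative assumptions, not values from the text.

```python
import math


def abient_potential(distance, strength, j=0.05):
    """Abient reaction potential at a given distance from the object (assumed form)."""
    return strength * 10 ** (-j * distance)


def balance_point(alley_length, strength_left, strength_right, j=0.05):
    """Distance from the left object at which the two abient potentials are equal.

    Solves strength_left * 10**(-j*x) == strength_right * 10**(-j*(L - x)) for x.
    A result outside 0..alley_length means no balance point lies inside the alley.
    """
    return (alley_length + math.log10(strength_left / strength_right) / j) / 2


equal = balance_point(10, 3.0, 3.0)      # duplicate objects
unequal = balance_point(10, 4.0, 2.0)    # stronger abient object at the left end
print(f"duplicate objects: balance at {equal:.2f} of 10")
print(f"stronger left object: balance at {unequal:.2f} of 10")
```

With duplicate objects the balance point falls at the middle of the alley, and strengthening one object moves it toward the weaker end, in line with the displacement described above.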

The interaction of an adient and an abient reaction potential
field has two cases. When the organism is placed between an adient 
and an abient object, both reaction potentials lead to movement 
toward the adient object. The joint reaction potential is large 
close to each of the objects, but tends to sag to a minimum at a 
point between them. A second case of this type of interaction is seen 
where both the adient and the abient object occupy practically 
the same point in space. The interaction of the two fields ordinarily 
results in a zero reaction-potential difference, a state of so-called 
stable equilibrium, at a point some distance from the combined 
objects. 


Static barriers encountered by organisms are abient objects 
for which the gradient has become maximal in steepness through 
differential reinforcement, so that the organism merely avoids 
rough contact with the object. The reactions of organisms to simple 
barriers in these adient and abient fields are complicated by stim-
ulus pattern discrimination set up on the basis of compound trial-
and-error learning. As a result of this process, sophisticated organ-
isms will not attempt to surmount really impassable barriers but
. . . by taking the shortest path either . . . abient object.
The various . . . constitute . . . habit-family hierarchy . . .


Terminal Notes 


HISTORICAL NOTE 


The facts of adience . . . animal behavior . . . that they cannot be
overlooked. Adience has been widely employed
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by animal psychologists as an indicator of the results of learning 
in the greatest variety of situations. Unfortunately this has been 
done with little or no explicit recognition of the inherent complexi- 
ties involved in the process itself. It is believed that this is the reason 
for some of the theoretical confusion regarding maze learning. 

The first important publication in the field of the behavior of 
organisms toward objects in space was by Lewin in 1933. An 
amplification of substantially the same material was published as a 
book (10) in 1935. These works presented an exceedingly valuable
analysis of the general field, and raised at a qualitative level a large 
number of the problems concerning behavior toward objects in 
space which have occupied the attention of subsequent workers, 
even though Lewin himself seemed not to have been much inter- 
ested in the spatial problems as such. 

In 1938 the present writer published a manuscript (7) written in 
1934, which attempted to apply a quantitative mathematical 
analysis to some of these problems, in particular to those involving 
the goal gradient hypothesis. Since the manuscript was already 
written on the basis of the by-then-abandoned (7, p. 273) logarith- 
mic formulation of the goal gradient, this form of the hypothesis 
appears in the published study. This article gave what is believed 
to be the first quantitative mathematical derivation of the problem 
of adient-abient equilibrium. It also gave quantitative analyses of 
several forms of the barrier problem. The author’s present view
is that these latter analyses are defective in that the principle of
afferent stimulus interaction and stimulus patterning was not
employed (8, pp. 349 ff.).

Around the year 1940, Neal E. Miller, in association with 
Judson S. Brown and several others, began an exceedingly sagacious 
and ingenious experimental attack on this series of problems, em-
ploying albino rats as subjects. Fortunately as early as 1942 Brown
published in detail a part of this experimental work, together with
the important germinal idea that the goal gradient principle is not
the only factor operating in open space. He says (7, p. 209):

It can be shown, however, that a number of these facts are 
also in accord with the concept of the spatial generalization of
conditioned responses. 

In the opinion of the present writer, the principle just quoted con-
stitutes the most important single advance recently made in this
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field. As the reader has already seen, it has been exploited on a
large scale in the foregoing chapter. While much of the work of 
Miller and his associates had not been published, owing to the 
participation of both Miller and Brown in the war effort, Miller 
was able in 1944 to include a summary of much of it in his chapter, 
“Experimental Studies of Conflict,” which appeared in Hunt’s
Personality and the Behavior Disorders (13, pp. 431 ff.). Miller’s theo-
retical analysis is essentially behavioristic in nature and, while
technically qualitative in form, clearly advances the subject to a new 
high level. The experimental results are admirably quantitative. 


THE MEANING OF THE EXPRESSION “FIELD THEORY”


The frequent use made in the present chapter of reaction-potential 
fields may quite naturally raise for the serious reader questions as 
to the relationship of these fields to the “field theories” and the 
“field forces” so extensively referred to in the literature of the 
Lewin branch of the Gestalt school. There is some uncertainty in 
this respect. This uncertainty has been increased by a late article 
by Lewin in which he proposed (11, p. 292) to make a final clarifi-
cation of the subject from his point of view. In this connection he
said (11, p. 294):


Field theory, therefore, can hardly be called correct or incor- 
rect in the same way as a theory in the usual sense of the term. 
Field theory is probably best characterized as a method: namely a 
method of analyzing causal relations and of building scientific con-
structs. This method of analyzing causal relations can be
expressed in the form of certain general statements about the 
nature of the conditions of change. 

In the present work the expression “field theory” definitely means
a theory in the natural-science sense, and one which is either
true or false in the usual meaning of the term. Moreover, field
theory as here used is concerned with action potentialities in space.
This, it is believed, is the ordinarily accepted use of the expression
in works on physics such as that by Lindsay and Margenau, where
various sorts of physical fields are dealt with and where, for exam-
ple, we find the expression (12, p. 283):

. . . a continuous region of space at every . . . act on a
standard particle placed there . . .
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In the present behavioral field theory the organism corresponds 
to the particle, and it is supposed to move in true space, but there 
the analogy to the fields found in physics largely ends. The law 
connecting a particle to the source of a gravitational field is that 
of the inverse square of the distance measured in feet or miles. The 
law relating the organism to the adient or abient object, on the other 
hand, is presumably approximately of the form, 

sE̲R = sER × 10^(−jd),

where d does not represent spatial distance as such, but instead 
represents j.n.d. values functionally based on distance. 
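
The contrast between the two kinds of law can be made concrete with a short sketch. It assumes that the behavioral law is a negative growth function in which the potential is multiplied by 10 raised to a negative constant times d, with d in j.n.d. units; the constant j = 0.01 and the source strength of 3.0σ are hypothetical values chosen only for illustration. The inverse-square function stands in for the gravitational case.

```python
def generalized_potential(source_sigma, d_jnd, j=0.01):
    """Reaction potential d j.n.d. units from the object (assumed negative growth form)."""
    return source_sigma * 10 ** (-j * d_jnd)


def inverse_square(source, distance):
    """Inverse-square law of a physical field, for comparison only."""
    return source / distance ** 2


for d in (10, 20, 40):
    print(f"d = {d:>2}: behavioral = {generalized_potential(3.0, d):.3f}, "
          f"inverse square = {inverse_square(3.0, d):.5f}")

# Equal steps in d reduce the behavioral potential by the same fraction each time.
step_one = generalized_potential(3.0, 10) / generalized_potential(3.0, 0)
step_two = generalized_potential(3.0, 20) / generalized_potential(3.0, 10)
print(f"fraction kept per 10 j.n.d.: {step_one:.4f} and {step_two:.4f}")
```

Under the assumed form each equal step in d removes a constant fraction of the remaining potential, whereas the inverse-square law removes a fraction that depends on how far out the step is taken.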
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9. Multidirectional Maze Learning 


Having considered in the last chapter the subject of organismic 
behavior in free and partially barricaded space, we may now resume 
the study of compound trial-and-error learning with an increased
capacity for understanding. Specifically, we propose to consider
the learning of the ordinary maze — one of the classical problems of
psychology. But before we proceed to the investigation of this major
subject we need to examine one or two principles concerning a
special type of problem which arises in the learning of what we
shall call the alternative-path maze (10, p. 26).

A simple form of this type of maze consists of two distinct and 
symmetrical pathways extending from a common starting point 
(S) to a common ending point or goal (G) where, usually, food is 
found. This is illustrated in Figure 62. For purposes of exposition 
these pathways are divided into equal units of distance separated 
by broken lines; the shorter path, yy', thus has four units of length, 
whereas the longer path, xx', has eight units of length. Now it is 
known on the basis of an ample series of experiments, beginning 
with a study by DeCamp (3) and culminating with studies by
Yoshioka (31) and Grice (6), that upon the whole if a hungry 
organism is given alternating rewarded trials on two paths of 
this general nature it will at length, when given free choices, come 
to take path yy', the shorter of the two. The experiments by 
Yoshioka and by Grice have also shown that the comparative ease 
with which the organism will learn to choose the shorter path is a 
function of the relative length of the two paths, rather than of the 
absolute difference between them. And Anderson has found that 
even with a period of delay substituted for the differential distance,
the path involving the shorter delay will also come to be a preferential
choice on a relative rather than an absolute basis.

These facts, among many others, have given support to the goal
gradient hypothesis which the reader has had occasion to consider
in numerous other but related connections in previous chapters.

FIGURE 62. [...] illustrating two goal gradients, x to x' and y to y'.
[...] The units of x to x' are numbered in Arabic, and those of y to y'
in Roman numerals.

[...] analysis of the Anderson data [...] considerable probability
[...] law backward from the point of reinforcement. In simple language
this means that the strength of reaction potential at one unit of
delay [...] the strength of reaction potential at [zero]
delay; that the sER at two units of delay from the goal will have a
similar fractional reduction below that at one unit of delay; and
so on through as many units of delay in reinforcement as occur.
For example (14, p. 163), if we take sER = 3.120σ as the strength
of reaction potential at the limit of training with no delay
in reinforcement, and 1/10 as the uniform factor of reduction (F),
then the sER at one unit of delay would be:

\[ 3.120 - \frac{3.120}{10} = 3.120 - .312 = 2.808. \]

Similarly, the sER at two units of delay in reinforcement would be:

\[ 2.808 - \frac{2.808}{10} = 2.808 - .2808 = 2.527, \]

and so on. On this principle, at the limit of training the reaction
potential to turn right at S (Figure 62), i.e., to choose path y four
units from reinforcement, would be 2.047σ, whereas that to turn
left, i.e., to choose path x eight units from reinforcement, would be
1.343σ. The difference between these two reaction potentials is,

\[ 2.047\sigma - 1.343\sigma = .704\sigma. \]

Now, assuming that the standard deviation of the sOR at these
two points is .3012, the standard deviation of the difference of the
two would then be,

\[ \sqrt{.3012^{2} + .3012^{2}} = \sqrt{.18156} = .426. \]

Dividing the obtained difference, .704σ, by the standard deviation
of the associated sOR, we have,

\[ \frac{.704}{.426} = 1.652. \]

This value of 1.652 has a functional relationship to the probability
of a correct choice being made at the [choice point]. Looking
this up in an appropriate table of the probability integral we find
that it corresponds to .451 + .500 = .951, or better than
95 short-path choices in a hundred trials, say. [With this]
example of the action of the goal gradient in a [simple]
situation before us, we may now begin the consideration of
multidirectional maze learning.
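
The computation just given may be sketched in a few lines of Python. This is only a minimal illustration, assuming (as the worked example implies) a zero-delay reaction potential of 3.120σ, a uniform reduction of one tenth per unit of delay, and an oscillation standard deviation of .3012. The function names are hypothetical, and the probability integral is obtained from the error function rather than from a printed table.

```python
from math import erf, sqrt

E_ZERO = 3.120       # reaction potential at zero delay, in sigma units (from the text)
F = 0.10             # uniform fractional reduction per unit of delay (from the text)
SIGMA_OSC = 0.3012   # standard deviation of oscillation (from the text)


def reaction_potential(units_of_delay):
    """Reaction potential at the limit of training after the given delay."""
    return E_ZERO * (1.0 - F) ** units_of_delay


def normal_cdf(z):
    """Cumulative standard normal probability (the 'probability integral' plus .5)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def choice_probability(short_units, long_units):
    """Probability of choosing the shorter path, assuming independent oscillation."""
    difference = reaction_potential(short_units) - reaction_potential(long_units)
    sd_of_difference = sqrt(2.0) * SIGMA_OSC   # about .426
    return normal_cdf(difference / sd_of_difference)


if __name__ == "__main__":
    print(round(reaction_potential(4), 3))      # path y, four units from the goal
    print(round(reaction_potential(8), 3))      # path x, eight units from the goal
    print(round(choice_probability(4, 8), 3))   # chance of choosing path y
```

Under these assumptions the two potentials agree with the 2.047σ and 1.343σ of the text, and the choice probability comes out close to the .951 read from the table.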
The Goal-Gradient Principle and the Short-Circuiting of Multidirectional-Maze Blind Alleys

The ordinary maze is often called the Hampton Court maze 
because, historically, such a maze was laid out on the grounds of a 
place known as Hampton Court. The walls consisted of high hedges,
and guests had the amusement of finding their way out through the 
intricate passages. In the hands of psychologists during recent 
years the maze has been adapted to the greatest variety of problems 
and has taken very many forms. In Chapter 6 we studied the 
phenomena associated with one of these forms, the linear maze as 
represented in Figure 44. Being linear, the true path of this maze 
necessarily extends as a whole in a single direction; such mazes 
may therefore be called unidirectional. But in the usual type of 
Hampton Court maze the true path may, and often does, extend
in many directions; for this reason we shall call such mazes
multidirectional.

Actually the latter type of maze is usually built on the right- 
angled principle. Such a maze may have as many as four paths 
emanating from any point. If one of these paths constitutes the 
entrance, there are three others which may serve as exits — right, 
left, and straight ahead. In order to simplify the matter of be- 
havioral interpretation somewhat, the straight-ahead path is 
frequently eliminated, leaving a T-shaped path which at each 
choice point in the maze forces the subject to turn either to the 
right or to the left. Numerous T’s joined together in various ways 
may make up a maze of any desired length and complexity. Such a 
maze, an adaptation of one used by Blodgett (1) in Tolman’s laboratory,
is represented diagrammatically in Figure 63. To simplify further
the interpretation of maze behavior, valves are often placed in
the true path to prevent retracing and the possibility of the subject’s 
entering the same blind repeatedly. Also, curtains are often placed 
at each side of the choice point to prevent the subject from seeing 
in advance of choice what lies beyond them, e.g., the dead end of 
a blind alley. 

When first put into a maze the animal, usually an albino rat, is 
apt to be very fearful, and ordinarily crouches quietly where first 
placed for some time. However, it will at length begin to explore the 
immediate vicinity, gradually extending the range of exploration,
with frequent retracings to the starting point, until the entire maze
is covered. For this reason the animal is often simply allowed to
explore the maze for an hour or so on each of several days before
the learning proper is begun and a record of behavior made. At 
that time the animal, usually very hungry, is placed in the maze at 
a given point such as that marked S in Figure 63, from which it 
wanders at will until at length it makes its way to the point marked 
G, where food is found and eaten. This constitutes a single trial. 



FIGURE 63. Diagrammatic representation of a fairly typical multidirectional maze
made by combining five T’s. Actually the true path in this case moves only north, south,
and east, with no movement west. The five choice points are indicated by Arabic
numerals in the order of the distance from the starting point (S). The true (shorter)
path, AC, of the final or fifth section is represented by a continuous line; the long path,
ABC, via the blind alley is represented by a broken line. Adapted from a drawing published
by Blodgett (1, p. 117).

In doing this the animal will naturally enter many of the blind 
alleys. For example, instead of going directly from A to C and the 
food (Figure 63) the animal may go from A to B, turn 180 degrees 
and retrace its way back to A, and then go on to C and the food. 
This path is marked by the broken line. On successive trials the 
animal’s behavior gradually takes on a more “purposeful” appear- 
ance, the speed of locomotion increases, the number and durations 
of pauses decrease, and the number of blind alleys entered also 
gradually decreases until with most mazes and most rats no false 
locomotion at all is made.

In the history of behavior theory much attention has been given
to explanations of why animals cease to enter blind alleys during 
maze learning. Certain essentially qualitative principles such as 
recency, frequency, and intensity of associated stimuli, once much 
in vogue, were early put forward following the attempts of Hobhouse
(8, pp. 174 ff.) and Holmes (9, pp. 164 ff.) to explain learning
in general. An example of this is seen in Lloyd Morgan’s famous 
chick and caterpillar combination and the concept of organic 
congruity and incompatibility. Watson proposed a clear but 
definitely inadequate theory of maze learning based on simple 
probability coupled with the principles of associative frequency 
and recency (28, pp. 256-269); Thorndike puzzled over how the 
pleasures (of success) are “able to burn in and render predominant 
the association which led to them,” (23); and Peterson proposed 
a qualitative hypothesis based on “completeness of response” 
coupled with association. Thus as the quantitative theory of
behavior began to emerge, it was seen by most serious students of
learning that simple association alone as at that time conceived
was not adequate to account for blind-alley elimination.

We shall show presently that a number of different principles
operate in maze blind-alley elimination. In the interests of expository
clarity we shall examine these principles one at a time. The
first of these, as suggested in our introductory statement regarding
the alternative-path maze, is the goal gradient (10) or the delay in
reinforcement (iii A) hypothesis. On this analogy the blind alley
ABC of the last T-unit in the multidirectional maze represented in
Figure 63 corresponds in some sense to the long path in Figure 62,
and the short path or true alternative (AC) corresponds to the
short path in Figure 62; the former involves approximately three
units of delay in reinforcement (and of work), whereas the latter
involves only one unit. By means of computations exactly analogous
to those given above for the alternative-pathway maze and summarized
in the first line of Table 33, it may be seen that the short-circuit
path from A to C will yield at the limit of training a reaction
potential at A of 2.808σ for that choice of turn, whereas the long-circuit
path from A to B to C will yield a reaction potential at A
of 2.274σ for that choice of turn. The difference of .534σ yields a
ratio to .426 of 1.253. Reference to a table of the probability
integral shows that this corresponds to the probability of a right-hand
choice at A of 89.5 per cent at the limit of practice. Thus the
short-circuiting of a maze blind alley is to be expected theoretically
on the basis of the goal-gradient hypothesis alone.*

Generalizing on the basis of these considerations we arrive at 
our next theorem: 

THEOREM 103. Other things constant, the goal gradient will tend
strongly to cause the short-circuiting of errors, i.e., to cause the elimina- 
tion of the choice of blind alleys in maze learning to a suitable reinforc- 
ing agent. 

As already pointed out, the fact of blind-alley elimination was 
well known empirically long before the goal-gradient hypothesis 
was formulated.

TABLE 33. Systematic presentation of the theoretical probabilities of the correct over
the incorrect choice at typical choice points throughout a 19-blind-alley maze on
the assumption that the goal gradient is the only factor operating (which is not so)
and that the oscillation factor (σ of sOR) = .3012 (14, p. 163). But,

\[ \sqrt{.3012^{2} + .3012^{2}} = \sqrt{.18156} = .426. \]

No. of choice    Reaction       Reaction       Difference     Difference divided    Probability of
point counting   potential to   potential to   in favor of    by .426, the square   choice of correct
from goal        true path      blind alley    correct        root of the sum of    path by table of
                 choice         choice         choice         the two sOR's         the probability
                                                              squared               integral

 1               2.808          2.274          .534           1.253                 89.5
 3               2.274          1.843          .431           1.012                 84.4
 5               1.843          1.492          .351            .824                 79.5
 7               1.492          1.209          .283            .664                 74.7
 9               1.209           .980          .229            .538                 70.5
11                .980           .794          .186            .437                 66.9
13                .794           .643          .151            .354                 63.8
15                .643           .521          .122            .286                 61.1
17                .521           .422          .099            .232                 59.2
19                .422           .342          .080            .188                 57.4
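
The rows of Table 33 can be regenerated mechanically, which makes the construction of the table explicit. The sketch below is a minimal illustration under the assumptions stated in the caption and the worked example (zero-delay potential 3.120σ, reduction of one tenth per unit of delay, oscillation standard deviation .3012, and a blind alley that adds two units of delay). The names are hypothetical. Because the printed table appears to have been computed with intermediate rounding, the regenerated figures may differ from it slightly in the last place; a direct computation also gives about .1814 rather than .18156 under the radical, though the root is .426 in either case.

```python
from math import erf, sqrt

E_ZERO = 3.120       # reaction potential at zero delay, sigma units (from the text)
F = 0.10             # fractional reduction per unit of delay (from the text)
SIGMA_OSC = 0.3012   # oscillation standard deviation (from the table caption)
BLIND_EXTRA = 2      # extra units of delay incurred by entering the blind alley


def table_33(points=range(1, 20, 2)):
    """Return (point, true, blind, difference, ratio, per cent correct) per row."""
    sd_of_difference = sqrt(2.0) * SIGMA_OSC
    rows = []
    for n in points:
        true_path = E_ZERO * (1.0 - F) ** n
        blind_alley = E_ZERO * (1.0 - F) ** (n + BLIND_EXTRA)
        difference = true_path - blind_alley
        ratio = difference / sd_of_difference
        per_cent = 100.0 * 0.5 * (1.0 + erf(ratio / sqrt(2.0)))
        rows.append((n, true_path, blind_alley, difference, ratio, per_cent))
    return rows


if __name__ == "__main__":
    for n, true_path, blind_alley, difference, ratio, per_cent in table_33():
        print(f"{n:2d}  {true_path:5.3f}  {blind_alley:5.3f}  "
              f"{difference:5.3f}  {ratio:5.3f}  {per_cent:5.1f}")
```

The last column falls steadily as the choice point lies farther from the goal, which is the pattern on which the later theorems of this section rest.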


We next take up the question of whether the organism will learn
to eliminate by means of the goal-gradient principle alone a long
blind alley more easily than a short one. This problem may be
solved by a procedure closely similar to that just followed. In the
first problem a turn to the right was assumed to involve the traversing
of one unit of distance between the choice and the attainment
of the goal, and one second of delay in reinforcement, whereas a
turn to the left involved traversing three units of distance and three
seconds of delay in reinforcement. But suppose that instead the
left turn entered a blind alley twice as long as the B choice shown
in Figure 63, which would mean traversing five units of length and
a delay of five seconds in reinforcement. By Table 33, a delay of
five seconds will reduce the reaction potential to 1.843σ as compared
with 2.808σ at one second. But 2.808σ - 1.843σ = .965σ.
Dividing .965 by the square root of the sum of the squares of the two
standard deviations involved, we have .965 ÷ .426 = 2.26, which,
by a table of the normal probability integral, yields an advantage
of 98.8 per cent in favor of the shorter alternative path. But 98.8
> 89.5.

* However, see the terminal note in the present chapter.

Thus we arrive at our next theorem (10, p. 36):


THEOREM 104. Other things constant, the goal gradient will tend
strongly to favor the elimination of a long blind alley as compared with 
the elimination of a short one. 
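
The reasoning behind this theorem can be put in a general form. As a sketch only, suppose that reaction potential falls by a constant fraction F per unit of delay from a zero-delay value E_0, and that the two competing potentials oscillate independently with standard deviation σ_O. The symbols E_0, n, k, and Φ (the cumulative normal function) are introduced here for illustration and are not the notation of the text.

```latex
% n = units of delay on the true path from the choice point
% k = additional units of delay incurred by entering the blind alley
\[
p(n,k) \;=\; \Phi\!\left(
  \frac{E_0\,(1-F)^{n}\,\bigl[\,1-(1-F)^{k}\,\bigr]}{\sigma_O\sqrt{2}}
\right)
\]
% With E_0 = 3.120, F = 1/10, sigma_O = .3012:
%   n = 1, k = 2:  ratio about 1.25,  p about .895
%   n = 1, k = 4:  ratio about 2.27,  p about .988
```

Since the bracketed term grows as k increases, a longer blind alley always yields a larger ratio and therefore a higher limiting probability of the correct choice, other things constant.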


The first study we have been able to find on the relative ease of 
eliminating long versus short blind alleys was reported in a monograph
by Joseph Peterson (17). On the basis of an ingenious study
in which he used twenty-four rats he concluded that short blinds 
were more easily eliminated than long ones. Unfortunately, curtains 
in mazes were not used at that time so that Peterson’s animals 
probably were able to see the ends of his short blind alleys without 
entering them. In his main experiment, moreover, six out of the 
ten blind alleys actually showed fewer errors on the shorter blinds.
Six years later, White and Tolman (29) took up the same problem 
in a wholly convincing manner, using a simplified maze with rela- 
tively long blinds possessing right-angled turns so that the subject 
could not see the blind end from the entrance. They based their 
conclusions on the behavior of fourteen rats given five trials per
day for four days. Every day of the experiment fewer entries were 
made by the group of subjects as a whole on the long alley than on 
the short one. And upon the whole the advantage of the elimina- 
tion of the long alleys over the short ones increased as practice 
continued. The percentages of long versus short blind-alley entrances
for the several days were: day 1, 48; day 2, 40; day 3, 23;
day 4, 36. Thus the theoretical deduction is believed substantiated
by empirical fact. 

Our third question concerns the relative ease of eliminating two 
blinds of the same length at the beginning and at the end of the 
maze respectively. This problem, again, is solved by methods quite 
analogous to those employed with the first problem. Consider the 
first and last blind alleys of the maze in Figure 63. Here, as in the 
case of the blind just considered, the difference in distance traversed 
and the delay in reinforcement between the blind-alley path and 
the shorter path is two units of distance and roughly two seconds in 
time. Thus there will be approximately five seconds of delay on the 
right turn and 5 + 2 or 7 seconds by the blind. By Table 33, the 
true path would have a reaction potential to the right turn of 
1.843σ, and one of 1.492σ to the left or incorrect turn. This yields
a difference of .351σ, which corresponds to a choice probability
of the elimination of the blind alley at the limit of practice of 
79.5 per cent. But 79.5 < 89.5. Thus we arrive at our next theorem: 

THEOREM 105. Other things constant, the elimination of a blind
alley at the beginning of a maze is more difficult, by the goal-gradient
principle alone, than at the termination (goal end) of a maze.

Moreover, a glance at the probability-of-choice values in Table 
33 at various distances from the terminus of a maze shows that 
elimination becomes progressively more difficult (10, p. 37). This
yields the following theorem: 

THEOREM 106. Other things constant, the last blind alley of a 
maze will be eliminated first, by the goal-gradient principle alone, and 
the others progressively in a backward order, the first blind alley being 
eliminated last. 

The generally backward order of the elimination of blind alleys 
in maze learning was early noticed by experimentalists, among
whom may be mentioned Carr and Peterson. Since 1917
many other investigators using various sorts of mazes have verified
the original observation, especially with homogeneous mazes on
which the interpretation is somewhat clearer. Spence (27), assembling
data from twelve mazes of this type ranging from six to
fourteen units in length, found that the mean ranks of the alleys
from easiest to most difficult blind-alley elimination for the first,
second, and third thirds of the mazes were 7.66, 4.36, and 3.58, with
a satisfactory statistical reliability between all three pairs of differences.
Since a small-numbered rank means easy learning, this
shows upon the whole a backward order of blind-alley elimination,
though other factors clearly enter.

But the number of trials required to complete the learning of a
maze depends upon the most difficult single blind alley, and this
(the first) depends upon the number of units following it in the 
maze. Thus by Table 33, a single-unit maze will yield at the limit 
of training 89.5 per cent correct choices, a five-unit maze will yield 
79.5 per cent correct choices, an eleven- and a nineteen-unit maze 
will yield 66.9 and 57.4 per cent successful choices respectively. 

Generalizing on these considerations we arrive at our next 
theorem (10, p. 37):

THEOREM 107. Other things constant, long multidirectional mazes
(with many choice points) will be more difficult to learn, by the goal-gradient
principle alone, than short ones.

For many years it has been known in a general way that long 
mazes are more difficult to learn than are short ones, though we 
have not been able to find any study where a strict comparison 
is made of the difficulty of learning multidirectional mazes differing 
only in the number of blind alleys. As a sample of the available
evidence we take the mean number of non-retracing errors made
in a simple right-left alternative maze on the first blind as reported
by Warden and Cummings (27). These are assembled in Table 34.
It is well known that in alternation maze learning the alternation
of early units is transferred more or less to the corresponding alternates
of later units, which complicates interpretation from this
point of view. Nevertheless it is evident that while the agreement
is not precise, presumably in part because of the small number of
animals used in each group, the tendency to agreement with the
theory is clear.

TABLE 34. The mean number of entrances into the first blind alley of five alternative-pathway
mazes as a function of the length (number of blind alleys) of each. After
Warden and Cummings (27).

Total number of          Mean number of entrances into
blind alleys in mazes    the first blind alley of maze

 2                       [illegible]
[illegible]              10.44
[illegible]               8.44
[illegible]              14.90
10                       14.25

Closely related to the above is the question of the shape of the
curve of correct choices at the various points as a function of their
distance from the point of reinforcement. We have secured this
merely by plotting the probability values in the last column of Table
33 as a function of the blind-alley position values as given in the first
column. This appears as Figure 64.

FIGURE 64. Graph representing the theoretically successful choices at the limit of
training as a function of the delay in reinforcement, by the uncomplicated goal gradient
as represented in column 6 of Table 33. Note that if they were counted from the anterior
of the maze, the numbering of the maze units would be reversed.

From an inspection of this graph we arrive at our next theorem.

THEOREM 108. Other things constant, the per cent of correct choices
at the several choice points of a maze progressively decreases under the
influence of the goal gradient alone as the choice points are more remote
from the point of reinforcement.

So far as we can discover, the problem of the curve of successful
choices as a function of the number of blind alleys between the 
goal and any choice point has never before been raised, either 
theoretically or experimentally. Moreover, other things never are 
constant in such series. For one thing such chains, if purely hetero- 
geneous and plotted in terms of correct responses, will arch down- 
ward, and if purely homogeneous will arch upward (Chapter 6). 
Then of course there is the matter of spatial orientation, the frus- 
tration at the ends of the blinds (xvii), and so on. 

Because of the general bearing of the relationship of reaction
time to reaction potential, as represented by the empirical equation
(5),

\[ \frac{8.71}{({}_{S}E_{R} + .599)^{[\ldots]}} \]


it follows from Table 33 that as the organism progresses in the 
reinforced trials during the learning of a maze, its speed of locomo- 
tion will progressively increase. 

From these considerations we arrive at our next two theorems: 


THEOREM 109. Other things constant, as an organism repeatedly
traverses a maze with reward at the posterior end, the rate of locomotion
will increase as a whole. 

THEOREM 110. Other things constant, i.e., apart from antedating 
and perseverative response-interferences, as an organism repeatedly 
traverses a maze with reward at the posterior end, the rate of locomotion
through the later part of the maze will become progressively faster 
than that through the early part. 


[The] simplest empirical evidence bearing on Theorems [109]
and [110], though it does not come from a situation involving
a series of blind alleys, is presented by the speed of rats running
in a plain 40-foot runway. This may be seen in Figure 65. The
faster running of the animals on days 6 and 7 as compared with
[...] positions of the curves. Thus
[...] empirically. The tilting up of both
[...] is probably due to the homogeneous
[...] which positively generalizes the learning from
[...] (see
Chapter 6). The goal gradient alone is therefore revealed by the
[...]

FIGURE 65. Graphic representation of the mean time required for fourteen albino
rats to traverse the several segments (sections of the runway) of a straight 40-foot
enclosed runway at two different stages of training, days 1 and 2 and days 6 and 7.
From Hull (12, p. 404).


Goal Orientation and Maze Learning 

In Chapter 8, Figure 60 illustrates a habit-family hierarchy with 
alternative paths in open space on a single side of a straight line 
from the starting point to the goal. We must here point out that 
according to the same theory (Theorem 107), other alternative 
paths in the same habit-family hierarchy exist in free space on the 
opposite side of the straight line; that an infinite number of paths 
of intermediate length pass between those alternative paths; and 
that at any given level of the habit-family hierarchy, a very large
number of alternative potential paths of equal length exist which themselves 
do not constitute a complete hierarchy.

Consider, now, the behavior of an organism which has previously
formed the visual habit-family hierarchies in open and relatively 
free space, on being placed in an enclosed maze. In traversing this 
maze from S to the goal or food box (G, Figure 63), the organism 
will form a locomotor habit corresponding to one of the grosser 
units of a habit-family hierarchy acquired in free space. Now in 
the past the organism has associated this sequence with directional 
movements such as those of the eyes, and with ordinary locomotion 
toward the goal object in space as performed at various points in 
its environment. Also, these associations have been followed by 
reinforcement, with the incidental action of the goal gradient. It 
accordingly follows that these guiding or pure-stimulus acts (ro)
will be evoked in the organism while it is in the maze situation, and
as stimuli will tend to arouse all of the responses characteristic of
the habit-family hierarchy in free space.

FIGURE 66. The heavy lines of these diagrams show five distinct pathways taken by
the same rat through the open-alley maze on as many consecutive trials, numbers 26
to 30 inclusive. Reproduced from Dashiell (2, p. 25).

Suppose, for example, that an animal finds itself in one of the
open-alley mazes represented in Figure 66. The entrance is at the
lower left-hand corner. Now in this maze there are twenty distinct
pathways to the goal, all of equal and minimal length. These
twenty paths constitute a given level of the habit-family hierarchy;
[...] constant value of reaction potential and,
[...] nature of the oscillation function
(14 [...]), [...] different
occasions [...]. [From these] considerations we arrive at our
next theorem:

THEOREM 111. [...]
practice, numerous alternative paths to the goal.
Ample empirical evidence bearing on Theorem 111 was published
by Dashiell in 1930 (2). A total of 27 animals were run on
substantially the same form of experiment with various controls.
In general Dashiell found all these animals taking many distinct
paths to the goal at all stages of the practice. He states, “Particularly
worth noting are the trials numbered 22 to 42 inclusive of
animal 11: in these 21 runs 13 different routes were included with
only three cases of an immediate repetition. The eleven animals
used in one series of 50 tests yielded an average of 7.5 distinct runs
with either one or no error on each.” A convenient concrete illustration
of the tendency to take alternative pathways through the
open-alley maze is represented in Figure 66, which shows the consecutive
paths chosen by one rat on runs 26 to 30 inclusive, all of
which are quite distinct. We therefore conclude that Theorem 111
is empirically substantiated.

In Chapter 8 we deduced the principle concerning the spatial
habit-family hierarchy; i.e., the principle that when an organism
finds its way to a goal by means of any member of a spatial habit-family
hierarchy (11) this habit is at once transferred to every
member of the hierarchy in that general situation, and that in
such a hierarchy the maximum transferred reaction is to paths
whose initial segment makes a zero angular deviation from a
straight line connecting with the goal (13, p. 284) at any given point
where the organism chances to be. From this principle a number of
maze-behavioral laws follow at once. One of these concerns the
tendency to enter goalward-pointing blind alleys (11, pp. 136 ff.).

Thus we arrive at our next theorem: 

THEOREM 112. Other things constant and no additional motivation
present, spatially naive organisms which have been reinforced at a
given goal in an enclosed maze will tend least to take blind alleys
whose directions make an angle of 180 degrees with a straight line
from the choice point to the goal, the chance increasing progressively
to its maximum as this angular divergence decreases toward zero.

Empirical evidence regarding the question of whether goal-pointing
blind alleys do in fact have more entrances than those
pointing away from the goal is unfortunately greatly complicated
by other factors, especially by the goal gradient which plays a
decisive role as already shown (Theorems 108, 109, 110). Ideally,
to secure empirical proof we would desire a set of alleys in which
the goalward-pointing blinds and those pointing away from the goal
are evenly distributed throughout the maze, which would equalize
the mean effect of the goal gradient. Actually such a situation
probably never has existed. The set of published data (25) which
we shall now discuss gives a mean rank for the goalward-pointing
blind alleys of 5.2, whereas the mean rank of the non-goalward-pointing
blinds is 8.6; this shows that the goalward-pointing blinds
on the whole are near the beginning of this maze and so have a
mean excess of blind-alley entrances because of the goal gradient
alone, and not necessarily because of the goal orientation principle.

FIGURE 67. Diagram of the Tolman-Honzik maze (24, p. 43) with straight lines
drawn from the goal box to each choice point. The blind alleys are numbered from the
[...]. The divergence of the blind alley direction from the direction of the goal or food-box
was read off with a transparent protractor as is shown by the arrow and broken circle
at choice point 14.

[...] extremely valuable. They were
[...] entrances of 36 albino rats as secured by
[...] graphs. Now, blind alleys
rarely point either directly toward or away from the food box. In
[...] the goal,
straight lines were drawn from the food box to the choice points of
each of the fourteen blind alleys as shown in Figure 67. Then with
a transparent protractor we measured the angular deviation of
each line from the direction in which each blind alley was pointing.

Our next task was to secure some blind-alley entrance values 
which were not distorted by goal gradient tendencies. By judicious 
search we found in Figure 67 seven combinations of consecutive 

TABLE 35. Table showing for a multidirectional maze, the goal gradient factor
remaining relatively constant on the average, the tendency for blind alleys with a
small angular divergence from a line to the goal (Figure 67) to have more entrances
on the average than blind alleys with a larger angular divergence from a line to the
goal. Based on the learning responses of 36 hungry animals on 17 reinforced trials.
Compiled from measurements based on Tolman and Honzik (25, p. 250).


        Smaller angular divergence                  Larger angular divergence
           from goal direction                         from goal direction

Distance  Angular     No. of   Mean no.   Distance  Angular     No. of   Mean no.  Agreement
from      divergence  errors   of errors  from      divergence  errors   of errors with
goal                                      goal                                     hypothesis

[2]       135         131.1    131.1      1         180          41.2     44.9     +
                                          3         154          48.7

[4]        90         288.5    288.5      3         154          48.7     50.6     +
                                          5         180          52.5

[4]        90         288.5    188.2      5         180          52.5     52.5     [+]
6         108          89.7

6         108          89.9     76.8      7         166         101.2    101.2     -
8         117          63.7

9          34         333.4    333.4      8         117          63.7    108.6     +
                                          10        135         153.6

9          34         333.4    290.3      10        135         153.6    153.6     +
11         56         247.3

11         56         247.3    317.5      10        135         153.6    185.4     +
12         45         348.4               14        162         217.3
13         63         357.2

Mean                           232.3                                      99.5

blind alleys, in each combination (usually three in number) of 
which (1) the two alleys at either side averaged the same number 
of steps from the goal as the alley lying between, and (2) either 
the middle alley or the two at its side showed considerable differ- 
ence in the extent of goal pointing. These appear in detail in Table 
35. For example, the first set of blind alleys chosen were respectively 
1, 2, and 3 steps from the goal. The two extreme alleys (1 + 3 = 4; 
4 ÷ 2 = 2) average the same distance from the goal as the distance 
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of the middle alley. Thus the goal gradient effect of the two ex- 
treme alleys will, upon the whole, average the same as that of the 
middle alley. Then the blind-alley entrance scores of the two 
alleys were averaged and compared with the blind-alley entrances 
of the middle alley. This procedure yielded, in the case of the three 
blinds here considered, the error value of 44.9 for the alley of 
larger angular divergence from the goal direction, and of 131.1 for 
the alleys of smaller angular divergence. Incidentally these results 
agree so far as they go with Theorem 112. Moreover, an examina- 
tion of Table 35 will show that in all but one of the seven combina- 
tions the smaller angular divergence from a direct line to the goal 
has the larger number of blind-alley entrances. 
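
The averaging step for the first combination can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original analysis; the variable names are illustrative assumptions, and the values are those given for the alleys 1, 2, and 3 steps from the goal. The unrounded flanking mean comes out slightly above the tabled 44.9.

```python
# Flanking-alley averaging for the first combination of Table 35.
# Each tuple is (steps_from_goal, blind_alley_entrances).
# Names are illustrative assumptions; values come from the table.
flanking = [(1, 41.2), (3, 48.7)]   # the two alleys at either side
middle = (2, 131.1)                 # the alley lying between them


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


flank_distance = mean(d for d, _ in flanking)
flank_entrances = mean(e for _, e in flanking)

# Condition (1): the flanking pair averages the same distance
# from the goal as the middle alley.
same_distance = flank_distance == middle[0]

print(f"flanking alleys: mean distance {flank_distance}, "
      f"mean entrances {flank_entrances:.2f}")
print(f"middle alley: distance {middle[0]}, entrances {middle[1]}")
print(f"goal gradient balanced: {same_distance}")
```

Because the two sides are matched on mean distance, any remaining difference in entrances is attributed to angular divergence rather than to the goal gradient.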

The average number of blind-alley entrances of the large-angled 
group of alleys is 99.5, and that of the small-angled group is 232.3. 
These values yield what we shall call the goal orientation index 
(G.O.), which has a zero value for a zero effect and 100 for a 
maximum effect, thus: 


$$
\begin{aligned}
\text{G.O.} &= 100\left(1 - \frac{99.5}{232.3}\right) \\
            &= 100(1 - .428) \\
            &= 100(.572) \\
            &= 57.2.
\end{aligned}
\tag{58}
$$
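
As an illustration only, the index can be written as a small Python function. The function name and argument names are assumptions made for this sketch; the formula is the one stated in the text, 100 times one minus the ratio of the large-angled mean to the small-angled mean.

```python
def goal_orientation_index(large_angle_mean, small_angle_mean):
    """Goal orientation index: 0 for no effect, 100 for a maximum effect.

    large_angle_mean: mean entrances for alleys diverging widely from
                      the goal direction.
    small_angle_mean: mean entrances for alleys diverging little from
                      the goal direction (must be non-zero).
    """
    return 100 * (1 - large_angle_mean / small_angle_mean)


# Means from the bottom row of Table 35.
index = goal_orientation_index(99.5, 232.3)
print(round(index, 1))

# Equal means on both sides give a zero index (no orientation effect).
print(goal_orientation_index(100.0, 100.0))
```

The index approaches 100 only as the large-angled mean approaches zero, which is the sense in which 100 marks a maximum effect.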


[...] probably will not hold for effectively comparing different 
[...] sampling limitations for comparing the same maze under different 
conditions. Also it may be [noted] that the farther away from the 
goal the [choice point] involved is situated, other things equal, 
the larger will be the number of entrances and the larger the 
difference; this [...] the blind-alley entrances or error values far 
from the [goal ...], as is shown by Table 36. We conclude then that 
the widely held view set forth in Theorem 112, that the [blind 
alleys pointing toward the goal] definitely favor entrance, is sub- 
stantiated by ample empirical evidence.* 

* Other very convincing evidence [...] 
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Through the goal-gradient principle it follows that the tendency 
to choose short but untried paths, angular divergence from the 
goal being held constant, will be an inverse function of the distance 
from the goal of the choice point in question (see Table 33, espe- 
cially fourth column). 

TABLE 36. The number of entrances into blind alleys as a function of the number of 
choice points from the goal, angular divergence from goal approximately constant. 
Based on the learning responses of 36 animals each given 17 reinforced trials. Com- 
piled from measurements based on Tolman and Honzik (25). 
     Alley farther from goal                  Alley nearer the goal                  Difference
Angular     Choice points  No. of      Angular     Choice points  No. of      Choice   No. of
divergence  from goal      entrances   divergence  from goal      entrances   points   entrances

 [...]           5           52.5        180°           1           41.2         4      + 11.3
 [...]          10          153.6        125°           2          131.1         8      + 22.5
 166°            7          101.2        153°           3           48.7         4      + 52.5
 117°            8           63.7        108°           6           89.9         2      - 26.2
 162°           14          217.3        166°           7          101.2         7      +116.1

Means:
 152.8°          8.8        117.6        146.9°         3.8         82.4         5.0      35.2
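
The difference columns and means of Table 36 can be recomputed from the paired distances and entrance scores. The sketch below is an illustrative check in Python, with variable names that are assumptions of this example. Recomputing from the listed distances, the second pair (10 and 2 choice points) differs by 8, the value consistent with the tabled mean difference of 5.0.

```python
# Rows of Table 36 as
# (far_distance, far_entrances, near_distance, near_entrances).
# Distances are counted in choice points from the goal.
pairs = [
    (5, 52.5, 1, 41.2),
    (10, 153.6, 2, 131.1),
    (7, 101.2, 3, 48.7),
    (8, 63.7, 6, 89.9),
    (14, 217.3, 7, 101.2),
]

# Farther-minus-nearer differences for each pair.
distance_diffs = [fd - nd for fd, _, nd, _ in pairs]
entrance_diffs = [fe - ne for _, fe, _, ne in pairs]

mean_distance_diff = sum(distance_diffs) / len(pairs)
mean_entrance_diff = sum(entrance_diffs) / len(pairs)

# A positive difference means the farther alley was entered more often,
# which is the direction the goal gradient predicts.
supporting_pairs = sum(1 for d in entrance_diffs if d > 0)

print("distance differences:", distance_diffs)
print("entrance differences:", [round(d, 1) for d in entrance_diffs])
print("mean differences:", round(mean_distance_diff, 1),
      round(mean_entrance_diff, 1))
print(f"{supporting_pairs} of {len(pairs)} pairs favor the farther alley")
```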


Thus we arrive at our next theorem: 


THEOREM 113. With the angle of the entrance of a blind alley 
with a straight line to the goal constant, the farther a choice point is 
from the goal in choice-point intervals, the smaller will be the difference 
in favor of a correct choice and the greater will be the tendency to 
enter the blind. 

In a sense Theorem 113 represents the goal gradient when un- 
complicated by the phenomena of goal orientation. The same 
Tolman-Honzik data from which Table 35 was derived yielded 
five pairs of alleys which had approximately the same goal-orien- 
tation angle but which stood at different distances from the goal. 
These data are assembled in Table 36. Four out of the five combi- 
nations show a greater number of blind-alley entrances as the 
distance from the goal becomes greater; the one exception had a 
choice-point difference of only 2. The mean distance from the goal 
is 5.0 maze units, whereas the mean number of blind-alley entrances 
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In regard to the reward value of an early turn in the maze 
toward the goal as contrasted with a later turn of the same angle, 
the distance traversed remaining the same, we have empirical 
evidence from a study by Yoshioka (30). Working in Tolman’s 
laboratory, Yoshioka trained 60 rats on two alternative pathways 
of approximately the same length. These pathways, shown in 
Figure 68, consisted of an outer triangular path with the turn at 



the top of the triangle 96 inches from the start, and an inner 
pentagonal path in which the same angular turn occurred as in the 
triangle, but approximately 48 inches nearer the choice point. After 
a certain number of “forced” alternating runs on the two paths 
with one of the two doors closed, the animals were given a large 
number of free choices. These trials yielded a significantly larger 
number of choices of the pentagonal path with the early turn 
toward the goal than of the triangular path with the later turn 
toward the goal. A series of additional related experiments by 
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Yoshioka (32) amply corroborate the same conclusions. Thus, 
despite the fact that the inner or pentagonal path is somewhat 
shorter and therefore preferred, Theorem 114 appears to be well 
substantiated. 


Anticipatory Turning and Maze Learning 

At this point we introduce a second principle, which was deduced 
much earlier and has already been utilized several times. This 
principle is to the effect that reactions which become conditioned 
to perseverative stimulus traces will, by the principle of stimulus 
generalization, be evoked by earlier and more intense phases of 
substantially the same stimulus traces; thus these traces yield 
anticipatory or antedating reactions. Wherever the tendency to 
turn at a given choice point is strong, e.g., at choice point 5 in 
Figure 69, and the preceding stimuli are similar, e.g., choice points 
4, 3, 2, and 1, this same turning reaction will tend to occur at one 
or another of the latter points whether they lead into a blind alley 
or the true pathway. 

From these considerations we arrive at our next theorem: 


THEOREM 115. In mazes where a given turning choice is strongly 
conditioned to perseverative stimulus traces and where closely similar 
stimuli and stimulus traces are encountered at antecedent positions, the 
same turning-choice reaction will tend to occur in advance of the 
reinforced choice point. 


Moreover, in case the earlier choice points and the acts between 
them are alike, the stimulus situation close to the point where the 
[...] will be more like that evoking the reinforced reaction (#4) 
than that at choice points farther away (#3, #2, or #1). This, on 
the basis of the gradient of [...], [yields a] gradient of the tend- 
ency to [...] in distance from the point [at which the turning 
reaction is] reinforced. 

Accordingly we arrive at our next theorem: 


THEOREM 116. [...] specified in Theorem 115, the tendency [...], 
often maladaptive, will be maximal near the point at which the 
turning reaction is reinforced [...] distance from that point. 




Empirical evidence concerning the tendency for subjects to 
make anticipatory turning errors in the Hampton Court type of 
maze is yielded by a maze designed by Spence and Shipley (22). A 
diagram of this maze is shown in Figure 69. Cases of entering 
the right-hand blind alleys instead of the opposite ones, or going 
straight ahead, constituted anticipatory errors. Because of the 
position of the food-box in this maze the factor of goal orientation 
was also involved, but in such a way that it could be distinguished 
from the anticipatory turning tendency in a manner which 
presumably revealed their relationship. Spence and Shipley 
reported that during the first nine trials on this maze a perfect 
gradient of right-turning errors developed, the error maximum 
being at choice point 1 (the alley pointing directly to the goal), 
and the gradient decreasing as the alleys approached choice 
point 5. Since the angular divergence these blind alleys make with 
a straight line leading to the goal increases as the choice points 
approach the A position, the above gradient is, of course, exactly 
what would be expected by the goal orientation principle 
(Theorem 112). However, as training continued the anticipatory- 
turning tendency apparently began to interact with the goal 
orientation factor. In any case the errors at the second choice 
point decreased and an antedating gradient leading to a high 
region at the final blind alley (choice point 4) developed. The 
four error values at this advanced stage of training were: 
1 = 12.5; 2 = 7.5; 3 = 26.5; 4 = 53.5. The last three figures 
verify empirically Theorem 116. 

FIGURE 69. Diagram of the floor plan of the Spence-Shipley [...] 
begins with the figure 1 at the lowest pair. The starting [...] from 
Spence and Shipley (22). 
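
The claim about the last three figures amounts to a monotonic rise toward the final blind alley. A minimal Python sketch of that check follows; the variable names are assumptions of this example, and the error values are the four reported above for the advanced stage of training.

```python
# Right-turning error values at the advanced stage of training,
# keyed by choice point number.
errors = {1: 12.5, 2: 7.5, 3: 26.5, 4: 53.5}


def strictly_increasing(values):
    """True when each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


last_three = [errors[k] for k in (2, 3, 4)]
all_four = [errors[k] for k in (1, 2, 3, 4)]

# The antedating gradient should rise toward choice point 4.
print("choice points 2-4 rise toward the final blind:",
      strictly_increasing(last_three))

# Choice point 1 breaks the rise; the text attributes its elevated
# value to goal orientation rather than to anticipatory turning.
print("choice points 1-4 rise throughout:",
      strictly_increasing(all_four))
```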

Jones and Taylor (?5) repeated the Spence-Shipley experiment 
except that with two groups of animals they placed their goal 
opposite the third choice point on the right side of the maze. With 




